Efficient Sampling for Learning Sparse Additive Models in High Dimensions

Hemant Tyagi, ETH Zürich; Andreas Krause, ETH Zürich; Bernd Gärtner, ETH Zürich

Abstract

We consider the problem of learning sparse additive models, i.e., functions of the form f(x) = Σ_{ℓ∈S} φ_ℓ(x_ℓ), x ∈ R^d, from point queries of f. Here S is an unknown subset of coordinate variables with |S| = k ≪ d. Assuming the φ_ℓ's to be smooth, we propose a set of points at which to sample f and an efficient randomized algorithm that recovers a uniform approximation to each unknown φ_ℓ. We provide a rigorous theoretical analysis of our scheme along with sample complexity bounds. Our algorithm utilizes recent results from compressive sensing theory along with a novel convex quadratic program for recovering robust uniform approximations to univariate functions from point queries corrupted with arbitrary bounded noise. Lastly, we theoretically analyze the impact of noise (either arbitrary but bounded, or stochastic) on the performance of our algorithm.

1 Introduction

Several problems in science and engineering require estimating a real-valued, non-linear (and often non-convex) function f defined on a compact subset of R^d in high dimensions. This challenge arises, e.g., when characterizing complex engineered or natural (e.g., biological) systems [1, 2, 3]. The numerical solution of such problems involves learning the unknown f from point evaluations (x_i, f(x_i))_{i=1}^n. Unfortunately, if the only assumption on f is mere smoothness, then the problem is in general intractable. For instance, it is well known [4] that if f is C^s-smooth then n = Ω((1/δ)^{d/s}) samples are needed for uniformly approximating f within error 0 < δ < 1. This exponential dependence on d is referred to as the curse of dimensionality. Fortunately, many functions arising in practice are much better behaved in the sense that they are intrinsically low-dimensional, i.e., depend on only a small subset of the d variables.
Estimating such functions has received much attention and has led to a considerable amount of theory along with algorithms that do not suffer from the curse of dimensionality (cf. [5, 6, 7, 8]). Here we focus on the problem of learning one such class of functions, assuming f possesses the sparse additive structure:

f(x_1, x_2, ..., x_d) = Σ_{ℓ∈S} φ_ℓ(x_ℓ);  S ⊂ {1, ..., d}, |S| = k ≪ d.  (1.1)

Functions of the form (1.1) are referred to as sparse additive models (SPAMs) and generalize sparse linear models, to which they reduce if each φ_ℓ is linear. The problem of estimating SPAMs has received considerable attention in the regression setting (cf. [9, 10, 11] and references within), where (x_i, f(x_i))_{i=1}^n are typically i.i.d. samples from some unknown probability measure P. This setting, however, does not consider the possibility of sampling f at specifically chosen points, tailored to the additive structure of f. In this paper, we propose a strategy for querying f, together with an efficient recovery algorithm, with much stronger guarantees than known in the regression setting. In particular, we provide the first results guaranteeing uniformly accurate recovery of each individual component φ_ℓ of the SPAM. This can be crucial in applications where the goal is not merely to approximate f, but to gain insight into its structure.

Related work. SPAMs have been studied extensively in the regression setting, with observations being corrupted with random noise. [9] proposed the COSSO method, which is an extension of the Lasso to the reproducing kernel Hilbert space (RKHS) setting. A similar extension was considered in [10]. In [12], the authors propose a least squares method regularized with smoothness, with each φ_ℓ lying in an RKHS, and derive error rates for estimating f in the L²(P) norm¹. [13, 14] propose methods based on least squares loss regularized with sparsity and smoothness constraints. [13] proves consistency of its method in terms of mean squared risk, while [14] derives error rates for estimating f in the empirical L²(P_n) norm¹. [11] considers the setting where each φ_ℓ lies in an RKHS. They propose a convex program for estimating f and derive error rates for the same, in the L²(P) and L²(P_n) norms. Furthermore, they establish the minimax optimality of their method for the L²(P) norm. For instance, they derive an error rate of O((k log d)/n + k n^{-2s/(2s+1)}) in the L²(P) norm for estimating C^s smooth SPAMs. An estimator similar to the one in [11] was also considered by [15]. They derive similar error rates as in [11], albeit under stronger assumptions on f. There is further related work in approximation theory, where it is assumed that f can be sampled at a desired set of points. [5] considers a setting more general than (1.1), with f simply assumed to depend on an unknown subset of k ≪ d coordinate variables. They construct a set of sampling points of size O(c^k log d) for some constant c > 0, and present an algorithm that recovers a uniform approximation² to f. This model is generalized in [8], with f assumed to be of the form f(x) = g(Ax) for unknown A ∈ R^{k×d}; each row of A is assumed to be sparse. [7] generalizes this by removing the sparsity assumption on A. While the methods of [5, 8, 7] could be employed for learning SPAMs, their sampling sets would be of size exponential in k, and hence sub-optimal.
Furthermore, while these methods derive uniform approximations to f, they are unable to recover the individual φ_ℓ's.

Our contributions. Our contributions are threefold:

1. We propose an efficient algorithm that queries f at O(k log d) locations and recovers: (i) the active set S along with (ii) a uniform approximation to each φ_ℓ, ℓ ∈ S. In contrast, the existing error bounds in the statistics community [11, 12, 15] are in the much weaker L²(P) sense. Furthermore, the existing theory in both statistics and approximation theory provides explicit error bounds for recovering f and not the individual φ_ℓ's.

2. An important component of our algorithm is a novel convex quadratic program for estimating an unknown univariate function from point queries corrupted with arbitrary bounded noise. We derive rigorous error bounds for this program in the L∞ norm that demonstrate the robustness of the solution returned. We also explicitly demonstrate the effect of noise, sampling density and the curvature of the function on the solution returned.

3. We theoretically analyze the impact of additive noise in the point queries on the performance of our algorithm, for two noise models: arbitrary bounded noise and stochastic (i.i.d.) noise. In particular, for additive Gaussian noise, we show that our algorithm recovers a robust uniform approximation to each φ_ℓ with at most O(k³(log d)²) point queries of f. We also provide simulation results that validate our theoretical findings.

2 Problem statement

For any function g we denote its p-th derivative by g^(p) when p is large, else we use the appropriate number of prime symbols. ‖g‖_{L∞[a,b]} denotes the L∞ norm of g in [a, b]. For a vector x we denote its ℓ_q norm, 1 ≤ q ≤ ∞, by ‖x‖_q. We consider approximating functions f : R^d → R from point queries. In particular, for some unknown active set S ⊂ {1, ..., d} with |S| = k ≪ d, we assume f to be of the additive form f(x_1, ..., x_d) = Σ_{ℓ∈S} φ_ℓ(x_ℓ). Here the φ_ℓ : R → R are the individual univariate components of the model.
Our goal is to query f at suitably chosen points in its domain in order to recover an estimate φ_est,ℓ of φ_ℓ in a compact subset Ω ⊂ R for each ℓ ∈ S. We measure the approximation error in the L∞ norm. For simplicity, we assume that Ω = [−1, 1], meaning that we guarantee an upper

¹ ‖f‖²_{L²(P)} = ∫ f(x)² dP(x) and ‖f‖²_{L²(P_n)} = (1/n) Σ_i f²(x_i).
² This means in the L∞ norm.

bound on ‖φ_est,ℓ − φ_ℓ‖_{L∞[−1,1]} for each ℓ ∈ S. Furthermore, we assume that we can query f from a slight enlargement³ [−(1+r), (1+r)]^d of [−1, 1]^d for some small r > 0. As will be seen later, the enlargement r can be made arbitrarily close to 0. We now list our main assumptions for this problem.

1. Each φ_ℓ is assumed to be sufficiently smooth. In particular, we assume that φ_ℓ ∈ C⁵[−(1+r), (1+r)], where C⁵ denotes five times continuous differentiability. Since [−(1+r), (1+r)] is compact, this implies that there exist constants B₁, ..., B₅ ≥ 0 so that

max_{ℓ∈S} ‖φ_ℓ^(p)‖_{L∞[−(1+r),(1+r)]} ≤ B_p;  p = 1, ..., 5.  (2.1)

2. We assume each φ_ℓ to be centered in the interval [−1, 1], i.e. ∫_{−1}^{1} φ_ℓ(t) dt = 0 for ℓ ∈ S. Such a condition is necessary for unique identification of the φ_ℓ's. Otherwise one could simply replace each φ_ℓ with φ_ℓ + a_ℓ for a_ℓ ∈ R with Σ_ℓ a_ℓ = 0, and unique identification would not be possible.

3. We require that for each ℓ ∈ S there exists a connected interval I_ℓ ⊆ [−1, 1] with µ(I_ℓ) ≥ δ so that |φ′_ℓ(x)| ≥ D for all x ∈ I_ℓ. Here µ(I) denotes the Lebesgue measure of I, and δ, D > 0 are constants assumed to be known to the algorithm. This assumption essentially enables us to detect the active set S. If, say, φ′_ℓ were zero or close to zero throughout [−1, 1] for some ℓ ∈ S, then due to Assumption 2 this would imply that φ_ℓ is zero or close to zero.

We remark that it suffices to use estimates for our problem parameters instead of exact values. In particular, we can use upper bounds for k and B_p, p = 1, ..., 5, and lower bounds for the parameters D, δ. Our methods and results stated in the coming sections will remain unchanged.

3 Our sampling scheme and algorithm

In this section, we first motivate and describe our sampling scheme for querying f. We then outline our algorithm and explain the intuition behind its different stages. Consider the Taylor expansion of f at any point ξ ∈ R^d along the direction v ∈ R^d with step size ɛ > 0. For any C^p smooth f, p ≥ 2, we obtain, for ζ = ξ + θv with some 0 < θ < ɛ, the following expression:

(f(ξ + ɛv) − f(ξ))/ɛ = ⟨v, ∇f(ξ)⟩ + (ɛ/2) v^T ∇²f(ζ) v.
(3.1)

Note that (3.1) can be interpreted as taking a noisy linear measurement of ∇f(ξ) with the measurement vector v, the noise being the Taylor remainder term. Importantly, due to the sparse additive form of f, we have φ_ℓ ≡ 0 for ℓ ∉ S, implying that ∇f(ξ) = [φ′₁(ξ₁) φ′₂(ξ₂) ... φ′_d(ξ_d)]^T is at most k-sparse. Hence (3.1) actually represents a noisy linear measurement of the k-sparse vector ∇f(ξ). For any fixed ξ, we know from compressive sensing (CS) [16, 17] that ∇f(ξ) can be recovered (with high probability) using few random linear measurements⁴. This motivates the following sets of points with which we query f, as illustrated in Figure 1. For integers m_x, m_v > 0 we define

X := { ξ_i = (i/m_x)(1, 1, ..., 1)^T ∈ R^d : i = −m_x, ..., m_x },  (3.2)

V := { v_j ∈ R^d : v_{j,ℓ} = ±1/√m_v w.p. 1/2 each; j = 1, ..., m_v, ℓ = 1, ..., d }.  (3.3)

Using (3.1) at each ξ_i ∈ X and v_j ∈ V, for i = −m_x, ..., m_x and j = 1, ..., m_v, leads to:

y_{i,j} := (f(ξ_i + ɛv_j) − f(ξ_i))/ɛ = ⟨v_j, ∇f(ξ_i)⟩ + (ɛ/2) v_j^T ∇²f(ζ_{i,j}) v_j =: ⟨v_j, x_i⟩ + n_{i,j},  (3.4)

³ In case f : [a, b]^d → R, we can define g : [−1, 1]^d → R where g(x) = f(x(b−a)/2 + (b+a)/2) = Σ_{ℓ∈S} φ̄_ℓ(x_ℓ), with φ̄_ℓ(x_ℓ) = φ_ℓ(x_ℓ(b−a)/2 + (b+a)/2). We then sample g from within [−(1+r), (1+r)]^d for some small r > 0 by querying f, and estimate φ̄_ℓ in [−1, 1], which in turn gives an estimate of φ_ℓ in [a, b].
⁴ Estimating sparse gradients via compressive sensing has been considered previously by Fornasier et al. [8], albeit for a substantially different function class than ours. Hence their sampling scheme differs considerably from ours, and is not tailored for learning SPAMs.

where x_i = ∇f(ξ_i) = [φ′₁(i/m_x) φ′₂(i/m_x) ... φ′_d(i/m_x)]^T is k-sparse. Let us denote V = [v₁ ... v_{m_v}]^T, y_i = [y_{i,1} ... y_{i,m_v}]^T and n_i = [n_{i,1} ... n_{i,m_v}]^T. Then for each i, we can write (3.4) in the succinct form:

y_i = V x_i + n_i.  (3.5)

Here V ∈ R^{m_v×d} represents the linear measurement matrix, y_i ∈ R^{m_v} denotes the measurement vector at ξ_i, and n_i represents noise on account of the non-linearity of f. Note that we query f at |X|(|V| + 1) = (2m_x + 1)(m_v + 1) many points. Given y_i and V, we can recover a robust approximation to x_i via ℓ₁ minimization [16, 17]. On account of the structure of ∇f, we thus recover noisy estimates of the φ′_ℓ at equispaced points along the interval [−1, 1]. We are now in a position to formally present our algorithm for learning SPAMs.

Our algorithm for learning SPAMs. The steps involved in our learning scheme are outlined in Algorithm 1. Steps 1-4 involve the CS-based recovery stage, wherein we use the aforementioned sampling sets to formulate our problem as a CS one. Step 4 involves a simple thresholding procedure where an appropriate threshold τ is employed to recover the unknown active set S. In Section 4 we provide precise conditions on our sampling parameters which guarantee exact recovery, i.e. Ŝ = S. Step 5 leverages a convex quadratic program (P) that uses the noisy estimates x̂_{i,ℓ} of φ′_ℓ(i/m_x), for each ℓ ∈ Ŝ and i = −m_x, ..., m_x, to return a cubic spline estimate φ̃_ℓ of φ′_ℓ. This program and its theoretical properties are explained in Section 4. Finally, in Step 6 we derive our final estimate φ_est,ℓ via piecewise integration of φ̃_ℓ for each ℓ ∈ Ŝ. Hence our final estimate of φ_ℓ is a spline of degree 4. The performance of Algorithm 1 for recovering S and the individual φ_ℓ's is presented in Theorem 1, which is also our first main result. All proofs are deferred to the appendix.

Figure 1: The points ξ_i ∈ X (blue disks) and ξ_i + ɛv_j (red arrows) for v_j ∈ V.

Algorithm 1: Algorithm for learning the φ_ℓ in the SPAM f(x) = Σ_{ℓ∈S} φ_ℓ(x_ℓ)

1: Choose m_x, m_v and construct sampling sets X and V as in (3.2), (3.3).
2: Choose step size ɛ > 0. Query f at ξ_i and ξ_i + ɛv_j for i = −m_x, ..., m_x and j = 1, ..., m_v.
3: Construct y_i where y_{i,j} = (f(ξ_i + ɛv_j) − f(ξ_i))/ɛ for i = −m_x, ..., m_x and j = 1, ..., m_v.
4: Set x̂_i := argmin_{z : y_i = Vz} ‖z‖₁. For τ > 0 compute Ŝ = ∪_{i=−m_x}^{m_x} {ℓ ∈ {1, ..., d} : |x̂_{i,ℓ}| > τ}.
5: For each ℓ ∈ Ŝ, run (P) as defined in Section 4 using (x̂_{i,ℓ})_{i=−m_x}^{m_x}, τ and some smoothing parameter γ ≥ 0, to obtain φ̃_ℓ.
6: For each ℓ ∈ Ŝ, set φ_est,ℓ to be the piecewise integral of φ̃_ℓ as explained in Section 4.

Theorem 1. There exist constants C, C₁ > 0 such that if m_x ≥ 1/δ, m_v ≥ C₁ k log d, 0 < ɛ < D√m_v/(CkB₂) and τ = CɛkB₂/(2√m_v), then with high probability Ŝ = S, and for any γ ≥ 0 the estimate φ_est,ℓ returned by Algorithm 1 satisfies, for each ℓ ∈ S:

‖φ_est,ℓ − φ_ℓ‖_{L∞[−1,1]} ≤ [59(1 + γ)] (CɛkB₂/√m_v) + (1/(64 m_x⁴)) ‖φ_ℓ^(5)‖_{L∞[−1,1]}.  (3.6)

Recall that k, B₂, D, δ are our problem parameters introduced in Section 2, while ɛ is the step size parameter from (3.4). We see that with O(k log d) point queries of f and with ɛ < D√m_v/(CkB₂), the active set is recovered exactly. The error bound in (3.6) holds for all such choices of ɛ. It is a sum of two terms, the first of which arises during the estimation of ∇f in the CS stage. The second error term is the interpolation error bound for interpolating φ′_ℓ from its samples in the noise-free setting. We note that our point queries lie in [−(1 + ɛ/√m_v), (1 + ɛ/√m_v)]^d. For the stated condition on ɛ in Theorem 1 we have ɛ/√m_v < D/(CkB₂), which can be made arbitrarily close to zero by choosing an appropriately small ɛ. Hence we sample from only a small enlargement of [−1, 1]^d.
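As a minimal sketch of Steps 1-3 of Algorithm 1 (the toy function f, the dimensions and all variable names are our own illustrative choices, not from the paper), the sampling sets (3.2), (3.3) and the finite-difference measurements (3.4) can be written down directly:

```python
import numpy as np

rng = np.random.default_rng(0)

def sampling_sets(d, m_x, m_v):
    """Build X (3.2): 2*m_x+1 points on the diagonal of [-1,1]^d,
    and V (3.3): m_v random sign vectors scaled by 1/sqrt(m_v)."""
    X = np.array([(i / m_x) * np.ones(d) for i in range(-m_x, m_x + 1)])
    V = rng.choice([-1.0, 1.0], size=(m_v, d)) / np.sqrt(m_v)
    return X, V

def cs_measurements(f, X, V, eps):
    """Finite differences y_{i,j} = (f(xi_i + eps*v_j) - f(xi_i)) / eps,
    i.e. noisy linear measurements <v_j, grad f(xi_i)> as in (3.4)."""
    return np.array([[(f(xi + eps * v) - f(xi)) / eps for v in V] for xi in X])

# Hypothetical SPAM with d = 20 and active set {0, 3}:
d, m_x, m_v, eps = 20, 5, 12, 1e-3
f = lambda x: np.sin(np.pi * x[0]) + x[3] ** 3
X, V = sampling_sets(d, m_x, m_v)
Y = cs_measurements(f, X, V, eps)   # shape (2*m_x + 1, m_v)
```

Each row Y[i] is then a compressed measurement y_i = V x_i + n_i of the k-sparse gradient at ξ_i, which is what the decoding stage consumes.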

4 Analyzing the algorithm

We now describe and analyze in more detail the individual stages of Algorithm 1. We first analyze Steps 1-4, which constitute the compressive sensing (CS) based recovery stage. Next, we analyze Step 5, where we also introduce our convex quadratic program. Lastly, we analyze Step 6, where we derive our final estimate φ_est,ℓ.

Compressive sensing-based recovery stage. This stage of Algorithm 1 involves solving a sequence of linear programs for recovering estimates of x_i = [φ′₁(i/m_x) ... φ′_d(i/m_x)]^T for i = −m_x, ..., m_x. We note that the measurements y_i are noisy linear measurements of x_i, with the noise being arbitrary and bounded. For such a noise model, it is known that ℓ₁ minimization results in robust recovery of the sparse signal [18]. Using this result in our setting allows us to quantify the recovery error ‖x̂_i − x_i‖₂, as specified in Lemma 1.

Lemma 1. There exist constants c₃ ≥ 1 and C, c₁ > 0 such that for m_v satisfying c₃ k log d < m_v < d/(log 6)², we have with probability at least 1 − e^{−c₁ m_v} − e^{−√(m_v d)} that x̂_i satisfies ‖x̂_i − x_i‖₂ ≤ CɛkB₂/(2√m_v) for all i = −m_x, ..., m_x. Furthermore, given that this holds and m_x ≥ 1/δ is satisfied, we then have for any ɛ < D√m_v/(CkB₂) that the choice τ = CɛkB₂/(2√m_v) implies Ŝ = S.

Thus, upon using ℓ₁ minimization based decoding at the 2m_x + 1 points, we recover robust estimates x̂_i of x_i, which immediately gives us estimates φ̂′_ℓ(i/m_x) = x̂_{i,ℓ} of φ′_ℓ(i/m_x) for i = −m_x, ..., m_x and ℓ = 1, ..., d. In order to recover the active set S, we first note that the spacing between consecutive samples in X is 1/m_x. Therefore the condition m_x ≥ 1/δ implies, on account of Assumption 3, that the sample spacing is fine enough to ensure that for each ℓ ∈ S there exists a sample i for which |φ′_ℓ(i/m_x)| ≥ D holds. The stated choice of the step size ɛ essentially guarantees, for all ℓ ∉ S and all i, that φ̂′_ℓ(i/m_x) lies within a sufficiently small neighborhood of the origin, in turn enabling detection of the active set. Therefore, after this stage of Algorithm 1, we have at hand the active set S along with the estimates (φ̂′_ℓ(i/m_x))_{i=−m_x}^{m_x} for each ℓ ∈ S.
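The ℓ₁-minimization decoding in Step 4 is itself a linear program; a sketch (assuming SciPy's `linprog`; the noiseless instance below is purely illustrative) rewrites argmin ‖z‖₁ s.t. Vz = y as an LP over the positive and negative parts of z:

```python
import numpy as np
from scipy.optimize import linprog

def l1_decode(V, y):
    """Basis-pursuit decoding: argmin ||z||_1 s.t. V z = y.
    Split z = u - w with u, w >= 0 and minimize sum(u) + sum(w)."""
    m_v, d = V.shape
    c = np.ones(2 * d)
    A_eq = np.hstack([V, -V])
    res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
    uw = res.x
    return uw[:d] - uw[d:]

# Hypothetical instance: a 2-sparse gradient observed through m_v = 12 measurements.
rng = np.random.default_rng(1)
d, m_v = 20, 12
V = rng.choice([-1.0, 1.0], size=(m_v, d)) / np.sqrt(m_v)
x_true = np.zeros(d)
x_true[2], x_true[7] = 1.5, -0.8
x_hat = l1_decode(V, V @ x_true)
S_hat = {l for l in range(d) if abs(x_hat[l]) > 0.1}   # thresholding as in Step 4
```

The equality-constrained form shown here corresponds to the noiseless case; with the bounded Taylor-remainder noise of (3.5), the same decoder is applied to y_i = Vx_i + n_i and Lemma 1 bounds the resulting error.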
Furthermore, it is easy to see that |φ̂′_ℓ(i/m_x) − φ′_ℓ(i/m_x)| ≤ τ = CɛkB₂/(2√m_v) for all ℓ ∈ S and all i.

Robust estimation via cubic splines. Our aim now is to recover a smooth, robust estimate of φ′_ℓ using the noisy samples (φ̂′_ℓ(i/m_x))_{i=−m_x}^{m_x}. Note that the noise here is arbitrary and bounded by τ = CɛkB₂/(2√m_v). To this end we choose cubic splines as our estimates, which are essentially piecewise cubic polynomials that are C² smooth [19]. There is a considerable amount of literature in the statistics community devoted to the problem of estimating univariate functions from noisy samples via cubic splines (cf. [20, 21, 22, 23]), albeit under the setting of random noise. Cubic splines have also been studied extensively in the approximation-theoretic setting for interpolating samples (cf. [19, 24, 25]). We introduce our solution to this problem in a more general setting. Consider a smooth function g : [t₁, t₂] → R and a uniform mesh⁵ ∆ : t₁ = x₀ < x₁ < ... < x_{n−1} < x_n = t₂ with x_i − x_{i−1} = h. We have at hand noisy samples ĝ_i = g(x_i) + e_i, with the noise e_i being arbitrary and bounded: |e_i| ≤ τ. In the noiseless scenario the problem would be an interpolation one, for which a popular class of cubic splines are the not-a-knot cubic splines [24]. These achieve optimal O(h⁴) error rates for C⁴ smooth g without using any higher-order information about g as boundary conditions. Let H²_∆[t₁, t₂] denote the space of cubic splines defined on [t₁, t₂] w.r.t. ∆. We then propose finding the cubic spline estimate as a solution of the following convex optimization problem (in the 4n coefficients of the n cubic polynomials) for some parameter γ ≥ 0:

(P)  min_{L ∈ H²_∆[t₁,t₂]} ∫_{t₁}^{t₂} L″(x)² dx  (4.1)

s.t.  ĝ_i − γτ ≤ L(x_i) ≤ ĝ_i + γτ;  i = 0, ..., n,  (4.2)

L‴(x₁⁻) = L‴(x₁⁺),  L‴(x_{n−1}⁻) = L‴(x_{n−1}⁺).  (4.3)

⁵ We consider uniform meshes for clarity of exposition. The results in this section can be easily generalized to non-uniform meshes.

Note that (P) is a convex QP with linear constraints. The objective function can be verified to be a positive definite quadratic form in the spline coefficients⁶. Specifically, the objective measures the total curvature of a feasible cubic spline in [t₁, t₂]. Each of the constraints (4.2)-(4.3), along with the implicit continuity constraints of L^(p), p = 0, 1, 2, at the interior points of ∆, are linear equalities/inequalities in the coefficients of the piecewise cubic polynomials. (4.3) refers to the not-a-knot boundary conditions [24], which are also linear equalities in the spline coefficients. These conditions imply that L‴ is continuous⁷ at the knots x₁, x_{n−1}. Thus, (P) searches amongst the space of all not-a-knot cubic splines such that L(x_i) lies within a ±γτ interval of ĝ_i, and returns the smoothest solution, i.e., the one with the least total curvature. The parameter γ ≥ 0 controls the degree of smoothness of the solution. Clearly, γ = 0 implies interpolating the noisy samples (ĝ_i)_{i=0}^n. As γ increases, the search interval [ĝ_i − γτ, ĝ_i + γτ] becomes larger for all i, leading to smoother feasible cubic splines. The following theorem formally describes the estimation properties of (P) and is also our second main result.

Theorem 2. For g ∈ C⁴[t₁, t₂], let L : [t₁, t₂] → R be a solution of (P) for some parameter γ ≥ 0. We then have that

‖L − g‖_{L∞[t₁,t₂]} ≤ [118(1 + γ)] τ + (h⁴/64) ‖g^(4)‖_{L∞[t₁,t₂]}.  (4.4)

We show in the appendix that if ∫_{t₁}^{t₂} (L″(x))² dx > 0, then L is unique. Note that the error bound (4.4) is a sum of two terms. The first term is proportional to the external noise bound τ, indicating that the solution is robust to noise. The second term is the error that would arise even if the perturbation were absent, i.e. τ = 0. Intuitively, if γτ is large enough, then we would expect the solution returned by (P) to be a line. Indeed, a larger value of γτ would imply a larger search interval in (4.2), which, if sufficiently large, would allow a line (which has zero curvature) to lie in the feasible region.
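For the γ = 0 case, (P) reduces to not-a-knot interpolation of the noisy samples, and SciPy provides that boundary condition directly; its antiderivative method also performs the piecewise integration used later in Step 6. The sketch below (the function g = φ′ and all parameters are illustrative, and `CubicSpline` is only a stand-in for the solution of (P) at γ = 0):

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Noisy samples of g = phi' on a uniform mesh of [-1, 1], with |e_i| <= tau.
phi = lambda t: np.sin(np.pi * t)            # hypothetical centered component
dphi = lambda t: np.pi * np.cos(np.pi * t)   # g = phi'
n, tau = 20, 1e-3
t = np.linspace(-1.0, 1.0, n + 1)
rng = np.random.default_rng(2)
g_hat = dphi(t) + rng.uniform(-tau, tau, size=t.shape)

# gamma = 0 in (P): interpolate the samples with not-a-knot boundary conditions (4.3).
L = CubicSpline(t, g_hat, bc_type="not-a-knot")

# Step 6 sketch: piecewise integration, then centering so the estimate
# integrates to zero over [-1, 1] (Assumption 2), as in (4.6).
psi = L.antiderivative()
phi_est = lambda x: psi(x) - 0.5 * psi.integrate(-1.0, 1.0)

tt = np.linspace(-1.0, 1.0, 2001)
spline_err = np.max(np.abs(L(tt) - dphi(tt)))     # O(tau + h^4 ||g^(4)||), cf. (4.4)
final_err = np.max(np.abs(phi_est(tt) - phi(tt)))
```

For γ > 0 the feasible tube in (4.2) must be handled by a QP solver rather than an interpolation routine, but the γ = 0 endpoint already exhibits the O(τ + h⁴) behavior of Theorem 2.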
More formally, we show in the appendix sufficient conditions, τ = Ω(n^{1/2} ‖g‖_∞ γ^{−1}) and γ > 1, which, if satisfied, imply that the solution returned by (P) is a line. This indicates that if either n is small or g has small curvature, then moderately large values of τ and/or γ will cause the solution returned by (P) to be a line. If an estimate of ‖g‖_∞ is available, then one could, for instance, use the upper bound 1 + O(n^{1/2} ‖g‖_∞/τ) to restrict the range of values of γ within which (P) is used.

Theorem 2 has the following corollary for estimation of the C⁴ smooth φ′_ℓ in the interval [−1, 1]. The proof simply involves replacing g with φ′_ℓ, n + 1 with 2m_x + 1, h with 1/m_x and τ with CɛkB₂/(2√m_v). As the perturbation τ is directly proportional to the step size ɛ, we show in the appendix that if additionally ɛ = Ω(√(m_x m_v) ‖φ′_ℓ‖_∞ γ^{−1}), γ > 1, holds, then the corresponding estimate φ̃_ℓ will be a line.

Corollary 1. Let (P) be employed for each ℓ ∈ S using the noisy samples (φ̂′_ℓ(i/m_x))_{i=−m_x}^{m_x}, with step size ɛ satisfying 0 < ɛ < D√m_v/(CkB₂). Denoting by φ̃_ℓ the corresponding solution returned by (P), we then have for any γ ≥ 0 that:

‖φ̃_ℓ − φ′_ℓ‖_{L∞[−1,1]} ≤ [59(1 + γ)] (CɛkB₂/√m_v) + (1/(64 m_x⁴)) ‖φ_ℓ^(5)‖_{L∞[−1,1]}.  (4.5)

The final estimate. We now derive the final estimate φ_est,ℓ of φ_ℓ for each ℓ ∈ S. Denote x₀ (= −1) < x₁ < ... < x_{2m_x−1} < x_{2m_x} (= 1) as our equispaced set of points on [−1, 1]. Since φ̃_ℓ : [−1, 1] → R returned by (P) is a cubic spline, we have φ̃_ℓ(x) = φ̃_{ℓ,i}(x) for x ∈ [x_i, x_{i+1}], where φ̃_{ℓ,i} is a polynomial of degree at most 3. We then define φ_est,ℓ(x) := Φ_{ℓ,i}(x) + F_i for x ∈ [x_i, x_{i+1}] and i = 0, ..., 2m_x − 1. Here Φ_{ℓ,i} is an antiderivative of φ̃_{ℓ,i}, and the F_i are constants of integration. Denoting F₀ = F, we have that φ_est,ℓ is continuous at x₁, ..., x_{2m_x−1} for:

F_i = Φ_{ℓ,0}(x₁) + Σ_{j=1}^{i−1} (Φ_{ℓ,j}(x_{j+1}) − Φ_{ℓ,j}(x_j)) − Φ_{ℓ,i}(x_i) + F =: F̃_i + F;  1 ≤ i ≤ 2m_x − 1.

Hence, by denoting ψ_{ℓ,i}(x) := Φ_{ℓ,i}(x) + F̃_i, we obtain φ_est,ℓ(x) = ψ_ℓ(x) + F, where ψ_ℓ(x) = ψ_{ℓ,i}(x) for

⁶ Shown in the appendix.
⁷ f(x⁻) = lim_{h→0⁻} f(x + h) and f(x⁺) = lim_{h→0⁺} f(x + h) denote left and right hand limits respectively.

x ∈ [x_i, x_{i+1}]. Now, on account of Assumption 2, we require φ_est,ℓ to also be centered, implying F = −(1/2) ∫_{−1}^{1} ψ_ℓ(x) dx. Hence we output our final estimate of φ_ℓ to be:

φ_est,ℓ(x) := ψ_ℓ(x) − (1/2) ∫_{−1}^{1} ψ_ℓ(t) dt;  x ∈ [−1, 1].  (4.6)

Since φ_est,ℓ is by construction continuous in [−1, 1], is a piecewise combination of polynomials of degree at most 4, and since φ′_est,ℓ = φ̃_ℓ is a cubic spline, φ_est,ℓ is a spline function of order 4. Lastly, we show in the proof of Theorem 1 that ‖φ_est,ℓ − φ_ℓ‖_{L∞[−1,1]} ≤ 3 ‖φ̃_ℓ − φ′_ℓ‖_{L∞[−1,1]} holds. Using Corollary 1, this provides us with the error bounds stated in Theorem 1.

5 Impact of noise on the performance of our algorithm

Our third main contribution involves analyzing the more realistic scenario where the point queries are corrupted with additive external noise z. Thus, querying f in Step 2 of Algorithm 1 results in the noisy values f(ξ_i) + z_i and f(ξ_i + ɛv_j) + z_{i,j}, respectively. This changes (3.5) to the noisy linear system y_i = V x_i + n_i + z′_i, where z′_{i,j} = (z_{i,j} − z_i)/ɛ for i = −m_x, ..., m_x and j = 1, ..., m_v. Notice that the external noise gets scaled by 1/ɛ, while n_{i,j} scales linearly with ɛ.

Arbitrary bounded noise. In this model, the external noise is arbitrary but bounded, so that |z_i|, |z_{i,j}| < κ for all i, j. It can be verified, along the lines of the proof of Lemma 1, that ‖n_i + z′_i‖₂ ≤ 2κ√m_v/ɛ + ɛkB₂/(2√m_v). Observe that, unlike the noiseless setting, ɛ cannot be made arbitrarily close to 0, as this would blow up the impact of the external noise. The following theorem shows that if κ is small relative to D² < |φ′_ℓ(x)|², x ∈ I_ℓ, ℓ ∈ S, then⁸ there exists an interval for choosing ɛ within which Algorithm 1 recovers exactly the active set S. This condition has the natural interpretation that if the signal-to-external-noise ratio in I_ℓ is sufficiently large, then S can be detected exactly.

Theorem 3.
There exist constants C, C₁ > 0 such that if κ < D²/(16C²kB₂), m_x ≥ 1/δ, and m_v ≥ C₁ k log d hold, then for any ɛ ∈ (D√m_v/(2CkB₂))·[1 − A, 1 + A], where A := √(1 − (16C²kB₂κ)/D²), and τ = C(2κ√m_v/ɛ + ɛkB₂/(2√m_v)), we have in Algorithm 1, with high probability, that Ŝ = S and, for any γ ≥ 0 and each ℓ ∈ S:

‖φ_est,ℓ − φ_ℓ‖_{L∞[−1,1]} ≤ [59(1 + γ)] (4C√m_v κ/ɛ + CɛkB₂/√m_v) + (1/(64 m_x⁴)) ‖φ_ℓ^(5)‖_{L∞[−1,1]}.  (5.1)

Stochastic noise. In this model, the external noise is assumed to be i.i.d. Gaussian, so that z_i, z_{i,j} ∼ N(0, σ²) i.i.d. for all i, j. In this setting we consider resampling f at each query point N times and then averaging the noisy samples, in order to reduce σ. Given this, we now have z_i, z_{i,j} ∼ N(0, σ²/N) i.i.d. for all i, j. Using standard tail bounds for Gaussians, we can show that for any κ > 0, if N is chosen large enough, then |z_i − z_{i,j}| ≤ 2κ for all i, j with high probability. Hence the external noise would be bounded with high probability, and the analysis for Theorem 3 can be used in a straightforward manner. Of course, an advantage we have in this setting is that κ can be chosen arbitrarily close to zero by choosing a correspondingly large value of N. We state all this formally in the following theorem.

Theorem 4. There exist constants C, C₁ > 0 such that for κ < D²/(16C²kB₂), m_x ≥ 1/δ, and m_v ≥ C₁ k log d, if we re-sample each query in Step 2 of Algorithm 1 N > (σ²/κ²) log(2σ|X||V|/(κp)) times for 0 < p < 1, and average the values, then for any ɛ ∈ (D√m_v/(2CkB₂))·[1 − A, 1 + A], where A := √(1 − (16C²kB₂κ)/D²), and τ = C(2κ√m_v/ɛ + ɛkB₂/(2√m_v)), we have in Algorithm 1, with probability at least 1 − p − o(1), that Ŝ = S and, for any γ ≥ 0 and each ℓ ∈ S:

‖φ_est,ℓ − φ_ℓ‖_{L∞[−1,1]} ≤ [59(1 + γ)] (4C√m_v κ/ɛ + CɛkB₂/√m_v) + (1/(64 m_x⁴)) ‖φ_ℓ^(5)‖_{L∞[−1,1]}.  (5.2)

⁸ I_ℓ is the critical interval defined in Assumption 3 for detecting S.
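The resampling argument behind Theorem 4 is elementary to check numerically: averaging N i.i.d. N(0, σ²) draws gives N(0, σ²/N), and a Gaussian tail bound with a union bound over all queries keeps every averaged noise term below κ with probability at least 1 − p. The sketch below uses the standard tail bound 2·exp(−Nκ²/(2σ²)) for choosing N (constants and query count are illustrative, not the exact expression in Theorem 4):

```python
import numpy as np

# Averaging N i.i.d. N(0, sigma^2) draws per query reduces the noise to
# N(0, sigma^2 / N); a tail bound plus a union bound over all queries then
# keeps every averaged term below kappa with probability >= 1 - p_fail.
rng = np.random.default_rng(3)
sigma, kappa, p_fail = 0.01, 0.002, 0.1
n_queries = 855                       # (2*m_x + 1)(m_v + 1), as in Section 5

# N from P(|zbar| > kappa) <= 2 exp(-N kappa^2 / (2 sigma^2)) <= p_fail / n_queries:
N = int(np.ceil(2.0 * (sigma / kappa) ** 2 * np.log(2.0 * n_queries / p_fail)))

z_bar = rng.normal(0.0, sigma, size=(n_queries, N)).mean(axis=1)
frac_bounded = np.mean(np.abs(z_bar) <= kappa)
```

Because N grows like (σ/κ)², halving the target κ quadruples the number of repetitions, which is exactly the k² factor that appears in the overall O(k³(log d)²) query complexity.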

Note that we now query f N|X|(|V| + 1) times. Also, |X| = 2m_x + 1 = Θ(1) and κ = O(k⁻¹), as D, C, B₂, δ are constants. Hence the choice |V| = O(k log d) gives us N = O(k² log(p⁻¹ k² log d)) and leads to an overall query complexity of O(k³ log d log(p⁻¹ k² log d)) when the samples are corrupted with additive Gaussian noise. Choosing p = O(d^{−c}) for any constant c > 0 gives us a sample complexity of O(k³(log d)²), and ensures that the result holds with high probability. The o(1) term goes to zero exponentially fast as d → ∞.

Simulation results. We now provide simulation results on synthetic data to support our theoretical findings. We consider the noisy setting with the point queries being corrupted with Gaussian noise. For d = 1000, k = 4 and S = {2, 105, 424, 782}, consider f : R^d → R where f = φ₂(x₂) + φ₁₀₅(x₁₀₅) + φ₄₂₄(x₄₂₄) + φ₇₈₂(x₇₈₂), with φ₂(x) = sin(πx), φ₁₀₅(x) = exp(−2x) − …, φ₄₂₄(x) = (1/3)cos³(πx) − …, φ₇₈₂(x) = …. We choose δ = 0.3, D = 0.2, which can be verified to be valid parameters for the above φ_ℓ's. Furthermore, we choose m_x = ⌈2/δ⌉ = 7 and m_v = ⌈2k log d⌉ = 56 to satisfy the conditions of Theorem 4. Next, we choose the constants C = 0.2, B₂ = 35 and κ = D²/(16C²kB₂), as required by Theorem 4. For the choice ɛ = D√m_v/(2CkB₂), we then query f at (2m_x + 1)(m_v + 1) = 855 points. The function values are corrupted with Gaussian noise N(0, σ²/N) for σ = 0.01 and N = 100. This is equivalent to resampling and averaging the point queries N times. Importantly, the sufficient condition on N as stated in Theorem 4 is (σ²/κ²) log(2σ|X||V|/(κp)) ≈ 6974 for p = 0.1. Thus we consider a significantly undersampled regime. Lastly, we select the threshold τ = C(2κ√m_v/ɛ + ɛkB₂/(2√m_v)), as stated by Theorem 4, and employ Algorithm 1 for different values of the smoothing parameter γ.

Figure 2: Estimates φ_est,ℓ of φ_ℓ (black) for γ = 0.3 (red), γ = 1 (blue) and γ = 5 (green); panels (a)-(d) show the estimates of φ₂, φ₁₀₅, φ₄₂₄ and φ₇₈₂ respectively.
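The parameter choices in this simulation are easy to reproduce. The short computation below is a sketch (the ceiling-rounding is our assumption); it recovers the stated query count of 855, and its value for the sufficient condition on N lands in the same several-thousand range as the stated 6974, far above the N = 100 actually used:

```python
import numpy as np

# Parameters of the simulation in Section 5.
d, k = 1000, 4
delta, D, C, B2 = 0.3, 0.2, 0.2, 35.0
sigma, p = 0.01, 0.1

m_x = int(np.ceil(2.0 / delta))            # = 7
m_v = int(np.ceil(2.0 * k * np.log(d)))    # = 56
n_queries = (2 * m_x + 1) * (m_v + 1)      # = 855 query points

kappa = D ** 2 / (16.0 * C ** 2 * k * B2)  # noise level required by Theorem 4
eps = D * np.sqrt(m_v) / (2.0 * C * k * B2)

# Sufficient number of repetitions per query from Theorem 4 (several thousand,
# versus the N = 100 used in the undersampled experiment).
X_sz, V_sz = 2 * m_x + 1, m_v
N_suff = (sigma / kappa) ** 2 * np.log(2.0 * sigma * X_sz * V_sz / (kappa * p))
```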
The results are shown in Figure 2. Over 10 independent runs of the algorithm, we observed that S was recovered exactly each time. Furthermore, we see from Figure 2 that the recovery is quite accurate for γ = 0.3. For γ = 1, we notice that the search interval γτ becomes large enough to cause the estimates φ_est,424, φ_est,782 to become relatively smoother. For γ = 5, the search interval γτ becomes wide enough for a line to fit in the feasible region for φ̃₄₂₄, φ̃₇₈₂. This results in φ_est,424, φ_est,782 being quadratic functions. In the case of φ₂, φ₁₀₅, the search interval is not sufficiently wide for a line to lie in the feasible region, even for γ = 5. However, we notice that the estimates φ_est,2, φ_est,105 become relatively smoother, as expected.

6 Conclusion

We proposed an efficient sampling scheme for learning SPAMs. In particular, we showed that with only a few queries, we can derive uniform approximations to each underlying univariate function of the SPAM. A crucial component of our approach is a novel convex QP for robust estimation of univariate functions via cubic splines, from samples corrupted with arbitrary bounded noise. Lastly, we showed how our algorithm can handle noisy point queries for both (i) arbitrary bounded and (ii) i.i.d. Gaussian noise models. An important direction for future work would be to determine the optimality of our sampling bounds by deriving corresponding lower bounds on the sample complexity.

Acknowledgments. This research was supported in part by an SNSF grant and a Microsoft Research Faculty Fellowship.

References

[1] Th. Müller-Gronbach and K. Ritter. Minimal errors for strong and weak approximation of stochastic differential equations. Monte Carlo and Quasi-Monte Carlo Methods, pages 53-82.
[2] M.H. Maathuis, M. Kalisch, and P. Bühlmann. Estimating high-dimensional intervention effects from observational data. The Annals of Statistics, 37(6A).
[3] M.J. Wainwright. Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting. IEEE Transactions on Information Theory, 55(12).
[4] J.F. Traub, G.W. Wasilkowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York.
[5] R. DeVore, G. Petrova, and P. Wojtaszczyk. Approximation of functions of few variables in high dimensions. Constr. Approx., 33.
[6] A. Cohen, I. Daubechies, R.A. DeVore, G. Kerkyacharian, and D. Picard. Capturing ridge functions in high dimensions from point queries. Constr. Approx., pages 1-19.
[7] H. Tyagi and V. Cevher. Active learning of multi-index function models. Advances in Neural Information Processing Systems 25.
[8] M. Fornasier, K. Schnass, and J. Vybíral. Learning functions of few arbitrary linear parameters in high dimensions. Foundations of Computational Mathematics, 12(2).
[9] Y. Lin and H.H. Zhang. Component selection and smoothing in multivariate nonparametric regression. The Annals of Statistics, 34(5).
[10] M. Yuan. Nonnegative garrote component selection in functional ANOVA models. In AISTATS, volume 2.
[11] G. Raskutti, M.J. Wainwright, and B. Yu. Minimax-optimal rates for sparse additive models over kernel classes via convex programming. J. Mach. Learn. Res., 13(1).
[12] V. Koltchinskii and M. Yuan. Sparse recovery in large ensembles of kernel machines. In COLT.
[13] P. Ravikumar, J. Lafferty, H. Liu, and L. Wasserman. Sparse additive models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(5).
[14] L. Meier, S. van de Geer, and P. Bühlmann. High-dimensional additive modeling. The Annals of Statistics, 37(6B).
[15] V. Koltchinskii and M. Yuan. Sparsity in multiple kernel learning. The Annals of Statistics, 38(6).
[16] E.J. Candès, J.K. Romberg, and T. Tao. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8).
[17] D.L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4).
[18] P. Wojtaszczyk. ℓ₁ minimization with noisy data. SIAM Journal on Numerical Analysis, 50(2).
[19] J.H. Ahlberg, E.N. Nilson, and J.L. Walsh. The Theory of Splines and Their Applications. Academic Press, New York.
[20] I.J. Schoenberg. Spline functions and the problem of graduation. Proceedings of the National Academy of Sciences, 52(4).
[21] C.M. Reinsch. Smoothing by spline functions. Numer. Math., 10.
[22] G. Wahba. Smoothing noisy data with spline functions. Numerische Mathematik, 24(5).
[23] P. Craven and G. Wahba. Smoothing noisy data with spline functions. Numerische Mathematik, 31(4).
[24] C. de Boor. A Practical Guide to Splines. Springer-Verlag, New York.
[25] C.A. Hall and W.W. Meyer. Optimal error bounds for cubic spline interpolation. Journal of Approximation Theory, 16(2).

An Algorithm for Pruning Redundant Modules in Min-Max Modular Network An Agorithm for Pruning Redundant Modues in Min-Max Moduar Network Hui-Cheng Lian and Bao-Liang Lu Department of Computer Science and Engineering, Shanghai Jiao Tong University 1954 Hua Shan Rd., Shanghai

More information

More Scattering: the Partial Wave Expansion

More Scattering: the Partial Wave Expansion More Scattering: the Partia Wave Expansion Michae Fower /7/8 Pane Waves and Partia Waves We are considering the soution to Schrödinger s equation for scattering of an incoming pane wave in the z-direction

More information

arxiv: v3 [math.ca] 8 Nov 2018

arxiv: v3 [math.ca] 8 Nov 2018 RESTRICTIONS OF HIGHER DERIVATIVES OF THE FOURIER TRANSFORM MICHAEL GOLDBERG AND DMITRIY STOLYAROV arxiv:1809.04159v3 [math.ca] 8 Nov 018 Abstract. We consider severa probems reated to the restriction

More information

Statistics for Applications. Chapter 7: Regression 1/43

Statistics for Applications. Chapter 7: Regression 1/43 Statistics for Appications Chapter 7: Regression 1/43 Heuristics of the inear regression (1) Consider a coud of i.i.d. random points (X i,y i ),i =1,...,n : 2/43 Heuristics of the inear regression (2)

More information

Reconstructions that Combine Cell Average Interpolation with Least Squares Fitting

Reconstructions that Combine Cell Average Interpolation with Least Squares Fitting App Math Inf Sci, No, 7-84 6) 7 Appied Mathematics & Information Sciences An Internationa Journa http://dxdoiorg/8576/amis/6 Reconstructions that Combine Ce Average Interpoation with Least Squares Fitting

More information