Efficient Sampling for Learning Sparse Additive Models in High Dimensions
Hemant Tyagi (ETH Zürich), Andreas Krause (ETH Zürich), Bernd Gärtner (ETH Zürich)

Abstract

We consider the problem of learning sparse additive models, i.e., functions of the form f(x) = Σ_{l∈S} φ_l(x_l), x ∈ R^d, from point queries of f. Here S is an unknown subset of coordinate variables with |S| = k ≪ d. Assuming the φ_l's to be smooth, we propose a set of points at which to sample f and an efficient randomized algorithm that recovers a uniform approximation to each unknown φ_l. We provide a rigorous theoretical analysis of our scheme along with sample complexity bounds. Our algorithm utilizes recent results from compressive sensing theory along with a novel convex quadratic program for recovering robust uniform approximations to univariate functions from point queries corrupted with arbitrary bounded noise. Lastly, we theoretically analyze the impact of noise (either arbitrary but bounded, or stochastic) on the performance of our algorithm.

1 Introduction

Several problems in science and engineering require estimating a real-valued, non-linear (and often non-convex) function f defined on a compact subset of R^d in high dimensions. This challenge arises, e.g., when characterizing complex engineered or natural (e.g., biological) systems [1, 2, 3]. The numerical solution of such problems involves learning the unknown f from point evaluations (x_i, f(x_i))_{i=1}^n. Unfortunately, if the only assumption on f is mere smoothness, then the problem is in general intractable. For instance, it is well known [4] that if f is C^s-smooth, then n = Ω((1/δ)^{d/s}) samples are needed for uniformly approximating f within error 0 < δ < 1. This exponential dependence on d is referred to as the curse of dimensionality. Fortunately, many functions arising in practice are much better behaved, in the sense that they are intrinsically low-dimensional, i.e., depend on only a small subset of the d variables.
Estimating such functions has received much attention and has led to a considerable amount of theory along with algorithms that do not suffer from the curse of dimensionality (cf. [5, 6, 7, 8]). Here we focus on the problem of learning one such class of functions, assuming f possesses the sparse additive structure:

f(x_1, x_2, ..., x_d) = Σ_{l∈S} φ_l(x_l); S ⊂ {1, ..., d}, |S| = k ≪ d. (1.1)

Functions of the form (1.1) are referred to as sparse additive models (SPAMs) and generalize sparse linear models, to which they reduce if each φ_l is linear. The problem of estimating SPAMs has received considerable attention in the regression setting (cf. [9, 10, 11] and references within), where (x_i, f(x_i))_{i=1}^n are typically i.i.d. samples from some unknown probability measure P. This setting, however, does not consider the possibility of sampling f at specifically chosen points, tailored to the additive structure of f. In this paper, we propose a strategy for querying f, together with an efficient recovery algorithm, with much stronger guarantees than known in the regression setting. In particular, we provide the first results guaranteeing uniformly accurate recovery of each individual component φ_l of the SPAM. This can be crucial in applications where the goal is not merely to approximate f, but to gain insight into its structure.
Related work. SPAMs have been studied extensively in the regression setting, with observations being corrupted with random noise. [9] proposed the COSSO method, which is an extension of the Lasso to the reproducing kernel Hilbert space (RKHS) setting. A similar extension was considered in [10]. In [12], the authors propose a least squares method regularized with smoothness, with each φ_l lying in an RKHS, and derive error rates for estimating f in the L_2(P) norm^1. [13, 14] propose methods based on least squares loss regularized with sparsity and smoothness constraints. [13] proves consistency of its method in terms of mean squared risk, while [14] derives error rates for estimating f in the empirical L_2(P_n) norm^1. [11] considers the setting where each φ_l lies in an RKHS. They propose a convex program for estimating f and derive error rates for the same, in the L_2(P) and L_2(P_n) norms. Furthermore, they establish the minimax optimality of their method for the L_2(P) norm. For instance, they derive an error rate of O(k log d/n + k n^{-2s/(2s+1)}) in the L_2(P) norm for estimating C^s smooth SPAMs. An estimator similar to the one in [11] was also considered by [15]. They derive similar error rates as in [11], albeit under stronger assumptions on f. There is further related work in approximation theory, where it is assumed that f can be sampled at a desired set of points. [5] considers a setting more general than (1.1), with f simply assumed to depend on an unknown subset of k ≪ d coordinate variables. They construct a set of sampling points of size O(c^k log d) for some constant c > 0, and present an algorithm that recovers a uniform approximation^2 to f. This model is generalized in [8], with f assumed to be of the form f(x) = g(Ax) for unknown A ∈ R^{k×d}; each row of A is assumed to be sparse. [7] generalizes this by removing the sparsity assumption on A. While the methods of [5, 8, 7] could be employed for learning SPAMs, their sampling sets will be of size exponential in k, and hence sub-optimal.
Furthermore, while these methods derive uniform approximations to f, they are unable to recover the individual φ_l's.

Our contributions. Our contributions are threefold:

1. We propose an efficient algorithm that queries f at O(k log d) locations and recovers: (i) the active set S, along with (ii) a uniform approximation to each φ_l, l ∈ S. In contrast, the existing error bounds in the statistics community [11, 12, 15] are in the much weaker L_2(P) sense. Furthermore, the existing theory in both statistics and approximation theory provides explicit error bounds for recovering f, and not the individual φ_l's.

2. An important component of our algorithm is a novel convex quadratic program for estimating an unknown univariate function from point queries corrupted with arbitrary bounded noise. We derive rigorous error bounds for this program in the L_∞ norm that demonstrate the robustness of the solution returned. We also explicitly demonstrate the effect of noise, sampling density and the curvature of the function on the solution returned.

3. We theoretically analyze the impact of additive noise in the point queries on the performance of our algorithm, for two noise models: arbitrary bounded noise and stochastic (i.i.d.) noise. In particular, for additive Gaussian noise, we show that our algorithm recovers a robust uniform approximation to each φ_l with at most O(k^3 (log d)^2) point queries of f. We also provide simulation results that validate our theoretical findings.

2 Problem statement

For any function g we denote its p-th derivative by g^{(p)} when p is large, else we use the appropriate number of prime symbols. ||g||_{L_∞[a,b]} denotes the L_∞ norm of g in [a, b]. For a vector x we denote its l_q norm, 1 ≤ q ≤ ∞, by ||x||_q. We consider approximating functions f : R^d → R from point queries. In particular, for some unknown active set S ⊂ {1, ..., d} with |S| = k ≪ d, we assume f to be of the additive form f(x_1, ..., x_d) = Σ_{l∈S} φ_l(x_l). Here the φ_l : R → R are the individual univariate components of the model.
Our goal is to query f at suitably chosen points in its domain in order to recover an estimate φ_est,l of φ_l in a compact subset Ω ⊂ R, for each l ∈ S. We measure the approximation error in the L_∞ norm. For simplicity, we assume that Ω = [−1, 1], meaning that we guarantee an upper bound on ||φ_est,l − φ_l||_{L_∞[−1,1]}, l ∈ S. Furthermore, we assume that we can query f from a slight enlargement^3 [−(1+r), (1+r)]^d of [−1, 1]^d, for some small r > 0. As will be seen later, the enlargement r can be made arbitrarily close to 0. We now list our main assumptions for this problem.

1. Each φ_l is assumed to be sufficiently smooth. In particular, we assume that φ_l ∈ C^5[−(1+r), (1+r)], where C^5 denotes five times continuous differentiability. Since [−(1+r), (1+r)] is compact, this implies that there exist constants B_1, ..., B_5 ≥ 0 so that

max_{l∈S} ||φ_l^{(p)}||_{L_∞[−(1+r),(1+r)]} ≤ B_p; p = 1, ..., 5. (2.1)

2. We assume each φ_l to be centered in the interval [−1, 1], i.e., ∫_{−1}^{1} φ_l(t) dt = 0, l ∈ S. Such a condition is necessary for unique identification of the φ_l's. Otherwise one could simply replace each φ_l with φ_l + a_l, for a_l ∈ R with Σ_{l∈S} a_l = 0, and unique identification would not be possible.

3. We require that for each φ_l there exists a connected interval I_l ⊆ [−1, 1] with μ(I_l) ≥ δ, so that |φ_l'(x)| ≥ D for all x ∈ I_l. Here μ(I) denotes the Lebesgue measure of I, and δ, D > 0 are constants assumed to be known to the algorithm. This assumption essentially enables us to detect the active set S. If, say, φ_l' was zero or close to zero throughout [−1, 1] for some l ∈ S, then due to Assumption 2 this would imply that φ_l is zero or close to zero.

We remark that it suffices to use estimates of our problem parameters instead of exact values. In particular, we can use upper bounds for k and B_p, p = 1, ..., 5, and lower bounds for the parameters D, δ. Our methods and results stated in the coming sections will remain unchanged.

Footnote 1: ||f||^2_{L_2(P)} = ∫ f(x)^2 dP(x) and ||f||^2_{L_2(P_n)} = (1/n) Σ_i f^2(x_i).
Footnote 2: That is, in the L_∞ norm.

3 Our sampling scheme and algorithm

In this section, we first motivate and describe our sampling scheme for querying f. We then outline our algorithm and explain the intuition behind its different stages. Consider the Taylor expansion of f at any point ξ ∈ R^d along the direction v ∈ R^d with step size ε > 0. For any C^p smooth f, p ≥ 2, we obtain for ζ = ξ + θv, for some 0 < θ < ε, the following expression:

(f(ξ + εv) − f(ξ))/ε = <v, ∇f(ξ)> + (ε/2) v^T ∇²f(ζ) v.
(3.1)

Note that (3.1) can be interpreted as taking a noisy linear measurement of ∇f(ξ), with measurement vector v and with the noise being the Taylor remainder term. Importantly, due to the sparse additive form of f, we have φ_l ≡ 0 for l ∉ S, implying that ∇f(ξ) = [φ_1'(ξ_1) φ_2'(ξ_2) ... φ_d'(ξ_d)]^T is at most k-sparse. Hence (3.1) actually represents a noisy linear measurement of the k-sparse vector ∇f(ξ). For any fixed ξ, we know from compressive sensing (CS) [16, 17] that ∇f(ξ) can be recovered (with high probability) using few random linear measurements^4. This motivates the following sets of points using which we query f, as illustrated in Figure 1. For integers m_x, m_v > 0 we define

X := { ξ_i = (i/m_x)(1, 1, ..., 1)^T ∈ R^d : i = −m_x, ..., m_x }, (3.2)

V := { v_j ∈ R^d : v_{j,l} = ±1/√m_v w.p. 1/2 each; j = 1, ..., m_v and l = 1, ..., d }. (3.3)

Using (3.1) at each ξ_i ∈ X and v_j ∈ V, for i = −m_x, ..., m_x and j = 1, ..., m_v, leads to:

y_{i,j} := (f(ξ_i + εv_j) − f(ξ_i))/ε = <v_j, ∇f(ξ_i)> + (ε/2) v_j^T ∇²f(ζ_{i,j}) v_j =: <v_j, x_i> + n_{i,j}, (3.4)

Footnote 3: In case f : [a, b]^d → R, we can define g : [−1, 1]^d → R where g(x) = f(((b−a)/2)x + ((b+a)/2)(1, ..., 1)^T) = Σ_{l∈S} φ̃_l(x_l), with φ̃_l(x_l) = φ_l(((b−a)/2)x_l + (b+a)/2). We then sample g from within [−(1+r), (1+r)]^d for some small r > 0 by querying f, and estimate φ̃_l in [−1, 1], which in turn gives an estimate of φ_l in [a, b].
Footnote 4: Estimating sparse gradients via compressive sensing has been considered previously by Fornasier et al. [8], albeit for a substantially different function class than ours. Hence their sampling scheme differs considerably from ours, and is not tailored for learning SPAMs.
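The sampling sets (3.2)-(3.3) and the finite-difference measurements (3.4) are straightforward to instantiate. The sketch below is a minimal numpy illustration; the particular SPAM f (with active set {3, 7}), the dimensions and the step size are toy assumptions chosen for illustration, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m_x, m_v, eps = 20, 5, 12, 1e-3

# A toy SPAM with active set S = {3, 7}: f(x) = sin(pi*x_3) + x_7**2.
def f(x):
    return np.sin(np.pi * x[3]) + x[7] ** 2

# X: 2*m_x + 1 points on the diagonal of [-1, 1]^d, as in (3.2).
X = [(i / m_x) * np.ones(d) for i in range(-m_x, m_x + 1)]

# V: m_v random measurement vectors with entries +-1/sqrt(m_v), as in (3.3).
V = rng.choice([-1.0, 1.0], size=(m_v, d)) / np.sqrt(m_v)

# y_{i,j} = (f(xi_i + eps*v_j) - f(xi_i)) / eps: a noisy linear
# measurement of the k-sparse gradient at xi_i, as in (3.4).
Y = np.array([[(f(xi + eps * v) - f(xi)) / eps for v in V] for xi in X])

print(Y.shape)  # (2*m_x + 1, m_v) = (11, 12)
```

Each row Y[i] is then the measurement vector y_i fed to the l_1 decoder of Step 4.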
where x_i = ∇f(ξ_i) = [φ_1'(i/m_x) φ_2'(i/m_x) ... φ_d'(i/m_x)]^T is k-sparse. Let us denote V = [v_1 ... v_{m_v}]^T, y_i = [y_{i,1} ... y_{i,m_v}]^T and n_i = [n_{i,1} ... n_{i,m_v}]^T. Then for each i, we can write (3.4) in the succinct form:

y_i = V x_i + n_i. (3.5)

Here V ∈ R^{m_v × d} represents the linear measurement matrix, y_i ∈ R^{m_v} denotes the measurement vector at ξ_i, and n_i represents noise on account of the non-linearity of f. Note that we query f at |X|(|V| + 1) = (2m_x + 1)(m_v + 1) many points. Given y_i and V, we can recover a robust approximation to x_i via l_1 minimization [16, 17]. On account of the structure of ∇f, we thus recover noisy estimates of the φ_l' at equispaced points along the interval [−1, 1]. We are now in a position to formally present our algorithm for learning SPAMs.

Our algorithm for learning SPAMs. The steps involved in our learning scheme are outlined in Algorithm 1. Steps 1-4 involve the CS-based recovery stage, wherein we use the aforementioned sampling sets to formulate our problem as a CS one. Step 4 involves a simple thresholding procedure, where an appropriate threshold τ is employed to recover the unknown active set S. In Section 4 we provide precise conditions on our sampling parameters which guarantee exact recovery, i.e., Ŝ = S. Step 5 leverages a convex quadratic program (P) that uses the noisy estimates x̂_{i,l} of φ_l'(i/m_x), for each l ∈ Ŝ and i = −m_x, ..., m_x, to return a cubic spline estimate φ̃_l'. This program and its theoretical properties are explained in Section 4. Finally, in Step 6 we derive our final estimate φ_est,l via piecewise integration of φ̃_l', for each l ∈ Ŝ. Hence our final estimate of φ_l is a spline of degree 4. The performance of Algorithm 1 for recovering S and the individual φ_l's is presented in Theorem 1, which is also our first main result. All proofs are deferred to the appendix.

Figure 1: The points ξ_i ∈ X (blue disks) and ξ_i + εv_j (red arrows) for v_j ∈ V.

Algorithm 1: Algorithm for learning the φ_l in the SPAM f(x) = Σ_{l∈S} φ_l(x_l)

1: Choose m_x, m_v and construct sampling sets X and V as in (3.2), (3.3).
2: Choose step size ε > 0. Query f at f(ξ_i), f(ξ_i + εv_j) for i = −m_x, ..., m_x and j = 1, ..., m_v.
3: Construct y_i, where y_{i,j} = (f(ξ_i + εv_j) − f(ξ_i))/ε for i = −m_x, ..., m_x and j = 1, ..., m_v.
4: Set x̂_i := argmin_{y_i = Vz} ||z||_1. For τ > 0 compute Ŝ = ∪_{i=−m_x}^{m_x} { l ∈ {1, ..., d} : |x̂_{i,l}| > τ }.
5: For each l ∈ Ŝ, run (P) as defined in Section 4 using (x̂_{i,l})_{i=−m_x}^{m_x}, τ and some smoothing parameter γ ≥ 0, to obtain φ̃_l'.
6: For each l ∈ Ŝ, set φ_est,l to be the piecewise integral of φ̃_l', as explained in Section 4.

Theorem 1. There exist constants C, C_1 > 0 such that if m_x ≥ 1/δ, m_v ≥ C_1 k log d, 0 < ε < D√m_v/(2CkB_2) and τ = CεkB_2/√m_v, then with high probability Ŝ = S, and for any γ ≥ 0 the estimate φ_est,l returned by Algorithm 1 satisfies, for each l ∈ S:

||φ_est,l − φ_l||_{L_∞[−1,1]} ≤ [59(1 + γ)] CεkB_2/√m_v + ||φ_l^{(5)}||_{L_∞[−1,1]}/(64 m_x^4). (3.6)

Recall that k, B_2, D, δ are our problem parameters introduced in Section 2, while ε is the step size parameter from (3.4). We see that with O(k log d) point queries of f and with ε < D√m_v/(2CkB_2), the active set is recovered exactly. The error bound in (3.6) holds for all such choices of ε. It is a sum of two terms, the first of which arises during the estimation of ∇f in the CS stage. The second error term is the interpolation error bound for interpolating φ_l' from its samples in the noise-free setting. We note that our point queries lie in [−(1 + (ε/√m_v)), (1 + (ε/√m_v))]^d. For the stated condition on ε in Theorem 1 we have ε/√m_v < D/(2CkB_2), which can be made arbitrarily close to zero by choosing an appropriately small ε. Hence we sample from only a small enlargement of [−1, 1]^d.
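Step 4 decodes each x_i by l_1 minimization. One standard way to solve min ||z||_1 s.t. Vz = y is to rewrite it as a linear program over the positive and negative parts of z; the sketch below does this with scipy's linprog on a toy sparse vector standing in for ∇f(ξ_i) (all sizes and values are illustrative assumptions, not the paper's):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
d, m_v = 30, 15

# Toy k-sparse vector (stand-in for the gradient x_i at a query point).
x_true = np.zeros(d)
x_true[[4, 11]] = [1.5, -0.8]

# Measurement matrix with +-1/sqrt(m_v) entries, as in (3.3).
V = rng.choice([-1.0, 1.0], size=(m_v, d)) / np.sqrt(m_v)
y = V @ x_true  # noiseless measurements, for simplicity

# Basis pursuit: min ||z||_1 s.t. Vz = y, as an LP over z = u - w,
# u, w >= 0, with objective sum(u) + sum(w).
c = np.ones(2 * d)
A_eq = np.hstack([V, -V])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None))
z_hat = res.x[:d] - res.x[d:]

# Thresholding (Step 4): keep coordinates with |z_hat| > tau.
tau = 0.1
S_hat = set(np.flatnonzero(np.abs(z_hat) > tau))
print(S_hat)
```

Since x_true is itself feasible for the LP, the returned solution always has l_1 norm at most ||x_true||_1; with enough random measurements the recovery is exact with high probability.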
4 Analyzing the algorithm

We now describe and analyze in more detail the individual stages of Algorithm 1. We first analyze Steps 1-4, which constitute the compressive sensing (CS) based recovery stage. Next, we analyze Step 5, where we also introduce our convex quadratic program. Lastly, we analyze Step 6, where we derive our final estimate φ_est,l.

Compressive sensing-based recovery stage. This stage of Algorithm 1 involves solving a sequence of linear programs for recovering estimates x̂_i of x_i = [φ_1'(i/m_x) ... φ_d'(i/m_x)]^T, for i = −m_x, ..., m_x. We note that the measurements y_i are noisy linear measurements of x_i, with the noise being arbitrary and bounded. For such a noise model, it is known that l_1 minimization results in robust recovery of the sparse signal [18]. Using this result in our setting allows us to quantify the recovery error ||x̂_i − x_i||_2, as specified in Lemma 1.

Lemma 1. There exist constants c_3 ≥ 1 and C, c_1 > 0 such that for m_v satisfying c_3 k log d < m_v < d/(log 6)^2, we have with probability at least 1 − e^{−c_1 m_v} − e^{−√(m_v d)} that x̂_i satisfies ||x̂_i − x_i||_2 ≤ CεkB_2/√m_v for all i = −m_x, ..., m_x. Furthermore, given that this holds and that m_x ≥ 1/δ is satisfied, we then have for any ε < D√m_v/(2CkB_2) that the choice τ = CεkB_2/√m_v implies Ŝ = S.

Thus, upon using l_1-minimization-based decoding at the 2m_x + 1 points, we recover robust estimates x̂_i of x_i, which immediately gives us estimates φ̂_l'(i/m_x) = x̂_{i,l} of φ_l'(i/m_x) for i = −m_x, ..., m_x and l = 1, ..., d. In order to recover the active set S, we first note that the spacing between consecutive samples in X is 1/m_x. Therefore the condition m_x ≥ 1/δ implies, on account of Assumption 3, that the sample spacing is fine enough to ensure that for each l ∈ S there exists a sample i for which |φ_l'(i/m_x)| ≥ D holds. The stated choice of the step size ε essentially guarantees, for all l ∉ S and all i, that φ̂_l'(i/m_x) lies within a sufficiently small neighborhood of the origin, in turn enabling detection of the active set. Therefore, after this stage of Algorithm 1, we have at hand the active set S along with the estimates (φ̂_l'(i/m_x))_{i=−m_x}^{m_x} for each l ∈ S.
Furthermore, it is easy to see that |φ̂_l'(i/m_x) − φ_l'(i/m_x)| ≤ τ = CεkB_2/√m_v for all l ∈ S and all i.

Robust estimation via cubic splines. Our aim now is to recover a smooth, robust estimate of φ_l' by using the noisy samples (φ̂_l'(i/m_x))_{i=−m_x}^{m_x}. Note that the noise here is arbitrary and bounded by τ = CεkB_2/√m_v. To this end, we choose to use cubic splines as our estimates, which are essentially piecewise cubic polynomials that are C^2 smooth [19]. There is a considerable amount of literature in the statistics community devoted to the problem of estimating univariate functions from noisy samples via cubic splines (cf. [20, 21, 22, 23]), albeit under the setting of random noise. Cubic splines have also been studied extensively in the approximation-theoretic setting for interpolating samples (cf. [19, 24, 25]).

We introduce our solution to this problem in a more general setting. Consider a smooth function g : [t_1, t_2] → R and a uniform mesh^5 Δ : t_1 = x_0 < x_1 < ... < x_{n−1} < x_n = t_2 with x_i − x_{i−1} = h. We have at hand noisy samples ĝ_i = g(x_i) + e_i, with the noise e_i being arbitrary and bounded: |e_i| ≤ τ. In the noiseless scenario, the problem would be an interpolation one, for which a popular class of cubic splines are the not-a-knot cubic splines [24]. These achieve optimal O(h^4) error rates for C^4 smooth g, without using any higher-order information about g as boundary conditions. Let H^2_Δ[t_1, t_2] denote the space of cubic splines defined on [t_1, t_2] w.r.t. Δ. We then propose finding the cubic spline estimate as a solution of the following convex optimization problem (in the 4n coefficients of the n cubic polynomials), for some parameter γ ≥ 0:

(P)  min_{L ∈ H^2_Δ[t_1,t_2]} ∫_{t_1}^{t_2} (L''(x))^2 dx (4.1)
s.t. ĝ_i − γτ ≤ L(x_i) ≤ ĝ_i + γτ; i = 0, ..., n, (4.2)
L'''(x_1^−) = L'''(x_1^+), L'''(x_{n−1}^−) = L'''(x_{n−1}^+). (4.3)

Footnote 5: We consider uniform meshes for clarity of exposition. The results in this section can be easily generalized to non-uniform meshes.
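The program (P) itself optimizes over cubic-spline coefficients. As a rough illustration of the same idea, the sketch below solves a discretized analogue: it minimizes the squared second differences of the fitted values subject to the ±γτ band constraints of (4.2), using a bound-constrained quasi-Newton solver. The test function and all parameters are toy assumptions, not the paper's:

```python
import numpy as np
from scipy.optimize import minimize

# Noisy samples of a smooth g on a uniform mesh (toy stand-in).
rng = np.random.default_rng(2)
n, tau, gamma = 40, 0.05, 1.0
x = np.linspace(-1.0, 1.0, n + 1)
g_noisy = np.sin(np.pi * x) + rng.uniform(-tau, tau, size=n + 1)

# Second-difference matrix: (D2 @ L)[i] approximates h^2 * L'' at node i+1.
D2 = np.zeros((n - 1, n + 1))
for i in range(n - 1):
    D2[i, i:i + 3] = [1.0, -2.0, 1.0]

# Discrete analogue of (P): minimize the total squared curvature of L
# subject to the band constraints |L_i - g_noisy_i| <= gamma * tau.
def obj(L):
    r = D2 @ L
    return r @ r

def grad(L):
    return 2.0 * D2.T @ (D2 @ L)

bounds = list(zip(g_noisy - gamma * tau, g_noisy + gamma * tau))
res = minimize(obj, g_noisy, jac=grad, bounds=bounds, method="L-BFGS-B")
L_smooth = res.x
```

Starting from the (feasible) noisy samples themselves, the solver can only decrease the curvature objective, mirroring how (P) returns the smoothest spline inside the band.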
Note that (P) is a convex QP with linear constraints. The objective function can be verified to be a positive definite quadratic form in the spline coefficients^6. Specifically, the objective measures the total curvature of a feasible cubic spline in [t_1, t_2]. Each of the constraints (4.2)-(4.3), along with the implicit continuity constraints of L^{(p)}, p = 0, 1, 2, at the interior points of Δ, are linear equalities/inequalities in the coefficients of the piecewise cubic polynomials. (4.3) refers to the not-a-knot boundary conditions [24], which are also linear equalities in the spline coefficients. These conditions imply that L''' is continuous^7 at the knots x_1, x_{n−1}. Thus, (P) searches amongst the space of all not-a-knot cubic splines such that L(x_i) lies within a ±γτ interval of ĝ_i, and returns the smoothest solution, i.e., the one with the least total curvature. The parameter γ ≥ 0 controls the degree of smoothness of the solution. Clearly, γ = 0 implies interpolating the noisy samples (ĝ_i)_{i=0}^n. As γ increases, the search interval [ĝ_i − γτ, ĝ_i + γτ] becomes larger for all i, leading to smoother feasible cubic splines. The following theorem formally describes the estimation properties of (P) and is also our second main result.

Theorem 2. For g ∈ C^4[t_1, t_2], let L : [t_1, t_2] → R be a solution of (P) for some parameter γ ≥ 0. We then have that

||L − g||_∞ ≤ [118(1 + γ)] τ + h^4 ||g^{(4)}||_∞. (4.4)

We show in the appendix that if ∫_{t_1}^{t_2} (L''(x))^2 dx > 0, then L is unique. Note that the error bound (4.4) is a sum of two terms. The first term is proportional to the external noise bound τ, indicating that the solution is robust to noise. The second term is the error that would arise even if the perturbation were absent, i.e., τ = 0. Intuitively, if γτ is large enough, then we would expect the solution returned by (P) to be a line. Indeed, a larger value of γτ would imply a larger search interval in (4.2), which, if sufficiently large, would allow a line (which has zero curvature) to lie in the feasible region.
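A toy numeric illustration of this line regime: fit the best line to the noisy samples, measure its worst-case deviation, and pick γ just large enough that the line satisfies the band constraints of (4.2), so that a zero-curvature solution becomes feasible. The data below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
n, tau = 40, 0.05
x = np.linspace(-1.0, 1.0, n + 1)
# Low-curvature g plus bounded noise: close to linear.
g_noisy = 0.1 * x ** 2 + rng.uniform(-tau, tau, size=n + 1)

a, b = np.polyfit(x, g_noisy, 1)             # best line through the samples
dev = np.max(np.abs((a * x + b) - g_noisy))  # its worst-case deviation

gamma = 2.0 * dev / tau  # band wide enough for the line to be feasible
print(dev <= gamma * tau)
```

Any γ with γτ ≥ dev admits the line in the feasible region of (4.2), at which point (P)'s optimum has zero curvature.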
More formally, we show in the appendix sufficient conditions, τ = Ω(n^{1/2} ||g''||_∞ γ^{−1}) and γ > 1, which, if satisfied, imply that the solution returned by (P) is a line. This indicates that if either n is small or g has small curvature, then moderately large values of τ and/or γ will cause the solution returned by (P) to be a line. If an estimate of ||g''||_∞ is available, then one could, for instance, use the upper bound 1 + O(n^{1/2} ||g''||_∞/τ) to restrict the range of values of γ within which (P) is used.

Theorem 2 has the following corollary for the estimation of the C^4 smooth φ_l' in the interval [−1, 1]. The proof simply involves replacing g with φ_l', n + 1 with 2m_x + 1, h with 1/m_x, and τ with CεkB_2/√m_v. As the perturbation τ is directly proportional to the step size ε, we show in the appendix that if additionally ε = Ω(√(m_x m_v) ||φ_l'''||_∞ γ^{−1}) and γ > 1 hold, then the corresponding estimate φ̃_l' will be a line.

Corollary 1. Let (P) be employed for each l ∈ Ŝ using the noisy samples (φ̂_l'(i/m_x))_{i=−m_x}^{m_x}, with step size ε satisfying 0 < ε < D√m_v/(2CkB_2). Denoting φ̃_l' as the corresponding solution returned by (P), we then have for any γ ≥ 0 that:

||φ̃_l' − φ_l'||_{L_∞[−1,1]} ≤ [59(1 + γ)] CεkB_2/√m_v + ||φ_l^{(5)}||_{L_∞[−1,1]}/(64 m_x^4). (4.5)

The final estimate. We now derive the final estimate φ_est,l of φ_l, for each l ∈ S. Denote x_0 (= −1) < x_1 < ... < x_{2m_x−1} < x_{2m_x} (= 1) as our equispaced set of points on [−1, 1]. Since φ̃_l' : [−1, 1] → R returned by (P) is a cubic spline, we have φ̃_l'(x) = φ̃_{l,i}'(x) for x ∈ [x_i, x_{i+1}], where φ̃_{l,i}' is a polynomial of degree at most 3. We then define φ_est,l(x) := φ̃_{l,i}(x) + F_i for x ∈ [x_i, x_{i+1}] and i = 0, ..., 2m_x − 1. Here φ̃_{l,i} is an antiderivative of φ̃_{l,i}' and the F_i's are constants of integration. Denoting F_0 = F, we have that φ_est,l is continuous at x_1, ..., x_{2m_x−1} for:

F_i = φ̃_{l,0}(x_1) + Σ_{j=1}^{i−1} (φ̃_{l,j}(x_{j+1}) − φ̃_{l,j}(x_j)) − φ̃_{l,i}(x_i) + F =: F̃_i + F; 1 ≤ i ≤ 2m_x − 1.

Hence, by denoting ψ_{l,i}(x) := φ̃_{l,i}(x) + F̃_i, we obtain φ_est,l(x) = ψ_l(x) + F, where ψ_l(x) = ψ_{l,i}(x) for

Footnote 6: Shown in the appendix.
Footnote 7: f(x^−) = lim_{h→0^−} f(x + h) and f(x^+) = lim_{h→0^+} f(x + h) denote the left and right hand limits, respectively.
x ∈ [x_i, x_{i+1}]. Now, on account of Assumption 2, we require φ_est,l to also be centered, implying F = −(1/2) ∫_{−1}^{1} ψ_l(x) dx. Hence we output our final estimate of φ_l to be:

φ_est,l(x) := ψ_l(x) − (1/2) ∫_{−1}^{1} ψ_l(t) dt; x ∈ [−1, 1]. (4.6)

Since φ_est,l is by construction continuous in [−1, 1], is a piecewise combination of polynomials of degree at most 4, and since φ_est,l' = φ̃_l' is a cubic spline, φ_est,l is a spline function of order 4. Lastly, we show in the proof of Theorem 1 that ||φ_est,l − φ_l||_{L_∞[−1,1]} ≤ 3 ||φ̃_l' − φ_l'||_{L_∞[−1,1]} holds. Using Corollary 1, this provides us with the error bounds stated in Theorem 1.

5 Impact of noise on the performance of our algorithm

Our third main contribution involves analyzing the more realistic scenario where the point queries are corrupted with additive external noise z. Thus, querying f in Step 2 of Algorithm 1 results in the noisy values f(ξ_i) + z_i and f(ξ_i + εv_j) + z_{i,j}, respectively. This changes (3.5) to the noisy linear system y_i = V x_i + n_i + z_i', where z_{i,j}' = (z_{i,j} − z_i)/ε for i = −m_x, ..., m_x and j = 1, ..., m_v. Notice that the external noise gets scaled by 1/ε, while n_{i,j} scales linearly with ε.

Arbitrary bounded noise. In this model, the external noise is arbitrary but bounded, so that |z_i|, |z_{i,j}| < κ for all i, j. It can be verified, along the lines of the proof of Lemma 1, that ||n_i + z_i'||_2 ≤ 2κ√m_v/ε + εkB_2/2. Observe that, unlike the noiseless setting, ε cannot be made arbitrarily close to 0, as this would blow up the impact of the external noise. The following theorem shows that if κ is small relative to D^2 ≤ |φ_l'(x)|^2, x ∈ I_l, l ∈ S, then^8 there exists an interval for choosing ε within which Algorithm 1 recovers exactly the active set S. This condition has the natural interpretation that if the signal-to-external-noise ratio in I_l is sufficiently large, then S can be detected exactly.

Theorem 3.
There exist constants C, C_1 > 0 such that if κ < D^2/(16C^2 kB_2), m_x ≥ 1/δ, and m_v ≥ C_1 k log d hold, then for any ε ∈ (D√m_v/(4CkB_2))[1 − A, 1 + A], where A := √(1 − (16C^2 kB_2 κ)/D^2), and τ = C(2κ√m_v/ε + εkB_2/(2√m_v)), we have in Algorithm 1, with high probability, that Ŝ = S, and for any γ ≥ 0, for each l ∈ S:

||φ_est,l − φ_l||_{L_∞[−1,1]} ≤ [59(1 + γ)] (4C√m_v κ/ε + CεkB_2/√m_v) + ||φ_l^{(5)}||_{L_∞[−1,1]}/(64 m_x^4). (5.1)

Stochastic noise. In this model, the external noise is assumed to be i.i.d. Gaussian, so that z_i, z_{i,j} ~ N(0, σ^2) i.i.d. for all i, j. In this setting, we consider resampling f at each query point N times and then averaging the noisy samples, in order to reduce σ. Given this, we now have that z_i, z_{i,j} ~ N(0, σ^2/N) i.i.d. for all i, j. Using standard tail bounds for Gaussians, we can show that for any κ > 0, if N is chosen large enough, then |z_i − z_{i,j}| ≤ 2κ for all i, j with high probability. Hence the external noise z_{i,j}' would be bounded with high probability, and the analysis for Theorem 3 can be used in a straightforward manner. Of course, an advantage that we have in this setting is that κ can be chosen arbitrarily close to zero by choosing a correspondingly large value of N. We state all this formally in the form of the following theorem.

Theorem 4. There exist constants C, C_1 > 0 such that for κ < D^2/(16C^2 kB_2), m_x ≥ 1/δ, and m_v ≥ C_1 k log d, if we re-sample each query in Step 2 of Algorithm 1 N > (σ^2/κ^2) log(2σ|X||V|/(κ p)) times, for 0 < p < 1, and average the values, then for any ε ∈ (D√m_v/(4CkB_2))[1 − A, 1 + A], where A := √(1 − (16C^2 kB_2 κ)/D^2), and τ = C(2κ√m_v/ε + εkB_2/(2√m_v)), we have in Algorithm 1, with probability at least 1 − p − o(1), that Ŝ = S, and for any γ ≥ 0, for each l ∈ S:

||φ_est,l − φ_l||_{L_∞[−1,1]} ≤ [59(1 + γ)] (4C√m_v κ/ε + CεkB_2/√m_v) + ||φ_l^{(5)}||_{L_∞[−1,1]}/(64 m_x^4). (5.2)

Footnote 8: I_l is the critical interval defined in Assumption 3 for detecting S.
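The variance reduction behind Theorem 4 is elementary: averaging N i.i.d. N(0, σ²) noise draws yields noise distributed as N(0, σ²/N). A toy numpy check (all values are illustrative assumptions):

```python
import numpy as np

# Resampling each query N times and averaging, as in the stochastic-noise
# model: the averaged noise is N(0, sigma^2 / N).
rng = np.random.default_rng(3)
sigma, N, trials = 0.5, 100, 20000

z = rng.normal(0.0, sigma, size=(trials, N))
z_avg = z.mean(axis=1)  # one averaged noise value per simulated query

var_single = z.var()  # ~ sigma^2
var_avg = z_avg.var()  # ~ sigma^2 / N
print(var_single, var_avg)
```

The empirical variance of the averaged noise is roughly a factor N smaller, which is what lets κ be driven arbitrarily close to zero at the cost of a larger query budget.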
Note that we now query f at N|X|(|V| + 1) points in total. Also, |X| = 2m_x + 1 = Θ(1), and κ = O(k^{−1}), as D, C, B_2, δ are constants. Hence the choice |V| = O(k log d) gives us N = O(k^2 log(p^{−1} k^2 log d)) and leads to an overall query complexity of O(k^3 log d log(p^{−1} k^2 log d)) when the samples are corrupted with additive Gaussian noise. Choosing p = O(d^{−c}) for any constant c > 0 gives a sample complexity of O(k^3 (log d)^2), and ensures that the result holds with high probability. The o(1) term goes to zero exponentially fast as d → ∞.

Simulation results. We now provide simulation results on synthetic data to support our theoretical findings. We consider the noisy setting with the point queries being corrupted with Gaussian noise. For d = 1000, k = 4 and S = {2, 105, 424, 782}, consider f : R^d → R where f = φ_2(x_2) + φ_105(x_105) + φ_424(x_424) + φ_782(x_782), with φ_2(x) = sin(πx), φ_105(x) = exp(−2x), and φ_424(x) = (1/3) cos^3(πx) plus additional terms (the remaining expressions for φ_424 and φ_782 are lost in the transcription). We choose δ = 0.3, D = 0.2, which can be verified as valid parameters for the above φ_l's. Furthermore, we choose m_x = ⌈2/δ⌉ = 7 and m_v = ⌈2k log d⌉ = 56 to satisfy the conditions of Theorem 4. Next, we choose the constants C = 0.2, B_2 = 35 and κ = D^2/(16C^2 kB_2), as required by Theorem 4. For the choice ε = D√m_v/(4CkB_2), we then query f at (2m_x + 1)(m_v + 1) = 855 points. The function values are corrupted with Gaussian noise N(0, σ^2/N) for σ = 0.01 and N = 100. This is equivalent to resampling and averaging the point queries N times. Importantly, the sufficient condition on N as stated in Theorem 4 is (σ^2/κ^2) log(2σ|X||V|/(κ p)) = 6974 for p = 0.1. Thus we consider a significantly undersampled regime. Lastly, we select the threshold τ = C(2κ√m_v/ε + εkB_2/(2√m_v)), as stated by Theorem 4, and employ Algorithm 1 for different values of the smoothing parameter γ.

Figure 2: Estimates φ_est,l of φ_l (black), for γ = 0.3 (red), γ = 1 (blue) and γ = 5 (green). Panels: (a) estimates of φ_2, (b) estimates of φ_105, (c) estimates of φ_424, (d) estimates of φ_782.
The results are shown in Figure 2. Over 10 independent runs of the algorithm, we observed that S was recovered exactly each time. Furthermore, we see from Figure 2 that the recovery is quite accurate for γ = 0.3. For γ = 1, we notice that the search interval γτ becomes large enough to cause the estimates φ_est,424, φ_est,782 to become relatively smoother. For γ = 5, the search interval γτ becomes wide enough for a line to fit in the feasible region for φ_424', φ_782'. This results in φ_est,424, φ_est,782 being quadratic functions. In the case of φ_2, φ_105, the search interval is not sufficiently wide for a line to lie in the feasible region, even for γ = 5. However, we notice that the estimates φ_est,2, φ_est,105 become relatively smoother, as expected.

6 Conclusion

We proposed an efficient sampling scheme for learning SPAMs. In particular, we showed that with only a few queries, we can derive uniform approximations to each underlying univariate function of the SPAM. A crucial component of our approach is a novel convex QP for robust estimation of univariate functions via cubic splines, from samples corrupted with arbitrary bounded noise. Lastly, we showed how our algorithm can handle noisy point queries, for both (i) arbitrary bounded and (ii) i.i.d. Gaussian noise models. An important direction for future work would be to determine the optimality of our sampling bounds by deriving corresponding lower bounds on the sample complexity.

Acknowledgments. This research was supported in part by an SNSF grant and a Microsoft Research Faculty Fellowship.
References

[1] Th. Müller-Gronbach and K. Ritter. Minimal errors for strong and weak approximation of stochastic differential equations. Monte Carlo and Quasi-Monte Carlo Methods, pages 53-82.
[2] M.H. Maathuis, M. Kalisch, and P. Bühlmann. Estimating high-dimensional intervention effects from observational data. The Annals of Statistics, 37(6A).
[3] M.J. Wainwright. Information-theoretic limits on sparsity recovery in the high-dimensional and noisy setting. IEEE Transactions on Information Theory, 55(12).
[4] J.F. Traub, G.W. Wasilkowski, and H. Wozniakowski. Information-Based Complexity. Academic Press, New York.
[5] R. DeVore, G. Petrova, and P. Wojtaszczyk. Approximation of functions of few variables in high dimensions. Constr. Approx., 33.
[6] A. Cohen, I. Daubechies, R.A. DeVore, G. Kerkyacharian, and D. Picard. Capturing ridge functions in high dimensions from point queries. Constr. Approx., pages 1-19.
[7] H. Tyagi and V. Cevher. Active learning of multi-index function models. Advances in Neural Information Processing Systems 25.
[8] M. Fornasier, K. Schnass, and J. Vybíral. Learning functions of few arbitrary linear parameters in high dimensions. Foundations of Computational Mathematics, 12(2).
[9] Y. Lin and H.H. Zhang. Component selection and smoothing in multivariate nonparametric regression. The Annals of Statistics, 34(5).
[10] M. Yuan. Nonnegative garrote component selection in functional ANOVA models. In AISTATS, volume 2.
[11] G. Raskutti, M.J. Wainwright, and B. Yu. Minimax-optimal rates for sparse additive models over kernel classes via convex programming. J. Mach. Learn. Res., 13(1).
[12] V. Koltchinskii and M. Yuan. Sparse recovery in large ensembles of kernel machines. In COLT.
[13] P. Ravikumar, J. Lafferty, H. Liu, and L. Wasserman. Sparse additive models. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 71(5).
[14] L. Meier, S. Van De Geer, and P. Bühlmann. High-dimensional additive modeling.
The Annals of Statistics, 37(6B).
[15] V. Koltchinskii and M. Yuan. Sparsity in multiple kernel learning. The Annals of Statistics, 38(6).
[16] E.J. Candès, J.K. Romberg, and T. Tao. Stable signal recovery from incomplete and inaccurate measurements. Communications on Pure and Applied Mathematics, 59(8).
[17] D.L. Donoho. Compressed sensing. IEEE Transactions on Information Theory, 52(4).
[18] P. Wojtaszczyk. l_1 minimization with noisy data. SIAM Journal on Numerical Analysis, 50(2).
[19] J.H. Ahlberg, E.N. Nilson, and J.L. Walsh. The Theory of Splines and Their Applications. Academic Press, New York.
[20] I.J. Schoenberg. Spline functions and the problem of graduation. Proceedings of the National Academy of Sciences, 52(4).
[21] C.M. Reinsch. Smoothing by spline functions. Numer. Math., 10.
[22] G. Wahba. Smoothing noisy data with spline functions. Numerische Mathematik, 24(5).
[23] P. Craven and G. Wahba. Smoothing noisy data with spline functions. Numerische Mathematik, 31(4).
[24] C. de Boor. A Practical Guide to Splines. Springer-Verlag, New York.
[25] C.A. Hall and W.W. Meyer. Optimal error bounds for cubic spline interpolation. Journal of Approximation Theory, 16(2).
More information