On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization

Size: px
Start display at page:

Download "On the Complexity of Bandit and Derivative-Free Stochastic Convex Optimization"

Transcription

1 JMLR: Workshop an Conference Proceeings vol ) 1 On the Complexity of Banit an Derivative-Free Stochastic Convex Optimization Oha Shamir Microsoft Research an the Weizmann Institute of Science oha.shamir@weizmann.ac.il Abstract he problem of stochastic convex optimization with banit feeback in the learning community) or without knowlege of graients in the optimization community) has receive much attention in recent years, in the form of algorithms an performance upper bouns. However, much less is known about the inherent complexity of these problems, an there are few lower bouns in the literature, especially for nonlinear functions. In this paper, we investigate the attainable error/regret in the banit an erivative-free settings, as a function of the imension an the available number of queries. We provie a precise characterization of the attainable performance for strongly-convex an smooth functions, which also imply a non-trivial lower boun for more general problems. Moreover, we prove that in both the banit an erivative-free setting, the require number of queries must scale at least quaratically with the imension. Finally, we show that on the natural class of quaratic functions, it is possible to obtain a fast O1/ ) error rate in terms of, uner mil assumptions, even without having access to graients. o the best of our knowlege, this is the first such rate in a erivative-free stochastic setting, an hols espite previous results which seem to imply the contrary. Keywors: Stochastic Convex Optimization; Derivative-Free Optimization; Banit Convex Optimization; Regret 1. Introuction his paper consiers the following funamental question: Given an unknown convex function F, an the ability to query for possibly noisy) realizations of its values at various points, how can we optimize F with as few queries as possible? his question, uner ifferent guises, has playe an important role in several communities. In the optimization community, this is usually known as zeroth-orer or erivativefree convex optimization, since we only have access to function values rather than graients or higher-orer information. he goal is to return a point with small optimization error on some convex omain, using a limite number of queries. Derivative-free methos were among the earliest algorithms to numerically solve unconstraine optimization problems, an have recently enjoye increasing interest, being especial useful in black-box situations where graient information is har to compute or oes not exist Nesterov 011); Stich et al. 011). In a stochastic framework, we can only obtain noisy realizations of the function values for instance, ue to running the optimization process on sample ata). We refer to this setting as erivative-free SCO short for stochastic convex optimization). In the learning community, these kins of problems have been closely stuie in the context of multi-arme banits an more generally) banit online optimization, which are 013 O. Shamir.

2 Shamir powerful moels for sequential ecision making uner uncertainty Cesa-Bianchi an Lugosi 006); Bubeck an Cesa-Bianchi 01). In a stochastic framework, these settings correspon to repeately choosing points in some convex omain, obtaining noisy realizations of some unerlying convex function s value. However, rather than minimizing optimization error, our goal is to minimize the average) regret: roughly speaking, that the average of the function values we obtain is not much larger than the minimal function value. For example, the well-known multi-arme banit problem correspons to a linear function over the simplex. We refer to this setting as banit SCO. As will be more explicitly iscusse later on, any algorithm which attains small average regret can be converte to an algorithm with the same optimization error. In other wors, banit SCO is only harer than erivative-free SCO. We note that in the context of stochastic multi-arme banits, the potential gap between the two settings uner the terms cumulative regret an simple regret ) was introuce an stuie in Bubeck et al. 011). When one is given graient information, the attainable optimization error / average regret is well-known: uner mil conitions, it is Θ1/ ) for convex functions an Θ1/ ) for strongly-convex functions, where is the number of queries Zinkevich 003); Hazan an Kale 011); Rakhlin et al. 01). Note that these bouns o not explicitly epen on the imension of the omain. he inherent complexity of banit/erivative-free SCO is not as well-unerstoo. An important exception is multi-arme banits, where the attainable error/regret is known to be exactly Θ / ), where is the imension an is the number of queries 1 Auer et al. 00); Auibert an Bubeck 009). Linear functions over other convex omains has also been explore, with upper bouns on the orer of O / ) to O / ) e.g. Abbasi-Yakori et al. 011); Bubeck et al. 01)). For linear functions over general omains, information-theoretic Ω / ) lower bouns have been proven in Dani et al. 007, 008); Auibert et al. 011). However, these lower bouns are either on the regret not optimization error); shown for non-convex omains; or are implicit an rely on artificial, carefully constructe omains. In contrast, we focus here on simple, natural omains an convex problems. When ealing with more general, non-linear functions, much less is known. he problem was originally consiere over 30 years ago, in the seminal work by Yuin an Nemirovsky on the complexity of optimization Nemirovsky an Yuin 1983). he authors provie some algorithms an upper bouns, but as they themselves emphasize cf. pg. 359), the attainable complexity is far from clear. Quite recently, Jamieson et al. 01) provie an Ω / ) lower boun for strongly-convex functions, which emonstrates that the fast O1/ ) rate in terms of, that one enjoys with graient information, is not possible here. In contrast, the current best-known upper bouns are O 4 / ), O 3 / ), O / ) for convex, strongly-convex, an strongly-convex-an-smooth functions respectively Flaxman et al. 005); Agarwal et al. 010)); An a O 3 / ) boun for convex functions Agarwal 1. In a stochastic setting, a more common boun in the literature is O log )/ ), but the O-notation hies a non-trivial epenence on the form of the unerlying linear function in multi-arme banits terminology, a gap between the expecte rewars boune away from 0). Such assumptions are not natural in a nonlinear banits SCO setup, an without them, the regret is inee Θ / ). See for instance Bubeck an Cesa-Bianchi, 01, Chapter ) for more etails.

3 Complexity of Banit an Derivative-Free Stochastic Convex Optimization et al. 011)), which is better in terms of epenence on but very ba in terms of the imension. In this paper, we investigate the complexity of banit an erivative-free stochastic convex optimization, focusing on nonlinear functions, with the following contributions see also the summary in able 1): ˆ We prove that for strongly-convex an smooth functions, the attainable error/regret is exactly Θ / ). his has three important ramifications: First of all, it settles the question of attainable performance for such functions, an is the first sharp characterization of complexity for a general nonlinear banit/erivative-free class of problems. Secon, it proves that the require number of queries in such problems must scale quaratically with the imension, even in the easier optimization setting, an in contrast to the linear case which often allows linear scaling with the imension. hir, it formally provies a natural Ω / ) lower boun for more general classes of convex problems. ˆ We analyze an important special case of strongly-convex an smooth functions, namely quaratic functions. We show that for such functions, one can efficiently) attain Θ / ) optimization error, an that this rate is sharp. o the best of our knowlege, it is the first general class of nonlinear functions for which one can show a fast rate in terms of ) in a erivative-free stochastic setting. In fact, this may seem to contraict the result in Jamieson et al. 01), which shows an Ω / ) lower boun on quaratic functions. However, as we explain in more etail later on, there is no contraiction, since the example establishing the lower boun of Jamieson et al. 01) imposes an extremely small omain which actually ecays with ), while our result hols for a fixe omain. Although this result is tight, we also show that uner more restrictive assumptions on the noise process, it is sometimes possible to obtain better error bouns, as goo as O/ ). ˆ We prove that even for quaratic functions, the attainable average regret is exactly Θ / ), in contrast to the Θ / ) result for optimization error. his shows there is a real gap between what can be obtaine for erivative-free SCO an banit SCO, without any specific istributional assumptions. Again, this stans in contrast to settings such as multi-arme banits, where there is no ifference in their istributionfree performance. We emphasize that our upper bouns are base on the assumption that the function minimizer is boune away from the omain bounary, or that we can query points slightly outsie the omain. However, we argue that this assumption is not very restrictive in the context of strongly-convex functions especially in learning applications), where the omain is often R, an a minimizer always exists. he paper is structure as follows: In Sec., we formally efine the setup an introuce the notation we shall use in the remainer of the paper. For clarity of exposition, we begin with the case of quaratic functions in Sec. 3, proviing algorithms, upper an lower bouns. he tools an insights we evelop for the quaratic case will allow us to tackle the more general strongly-convex-an-smooth setting in Sec. 4. We en the main part of the paper with a summary an iscussion of open problems in Sec. 5. In Appenix A, we emonstrate 3

4 Shamir Optimization Error Average Regret Function ype O ) Ω ) O ) Ω ) Quaratic Str. Convex an Smooth Str. Convex Convex { 3 min, { 4 min, } 3 3 } { 3 min, { 4 min, } 3 3 } able 1: A summary of the complexity upper bouns O )) an lower bouns Ω )), for erivative-free stochastic convex optimization optimization error) an banit stochastic convex optimization average regret), for various function classes, in terms of the imension an the number of queries. he boxe results are shown in this paper. he upper bouns for the convex an strongly convex case combine results from Flaxman et al. 005); Agarwal et al. 010, 011). he table shows epenence on, only an ignores other factors an constants. that one can obtain improve performance in the quaratic case, if we re consiering more specific natural noise processes. Aitional proofs are presente in Appenix B.. Preliminaries Let enote the stanar Eucliean norm. We let F ) : W R enote the convex function of interest, where W R is a close) convex omain. We say that F is λ- strongly convex, for λ > 0, if for any w, w W an any subgraient g of F at w, it hols that F w ) F w) + g, w w + λ w w. Intuitively, this means that we can lower boun F everywhere by a quaratic function of fixe curvature. We say that F is µ-smooth if for any w, w W, an any subgraient g of F at w, it hols that F w ) F w) + g, w w + µ w w. Intuitively, this means that we can upperboun F everywhere by a quaratic function of fixe curvature. We let w W enote a minimizer of F on w. o prevent trivialities, we consier in this paper only functions whose optimum w is known beforehan to lie in some boune omain even if W is large or all of R ), an the function is Lipschitz in that omain. he learning/optimization process procees in rouns. Each roun t, we pick an query a point w t W, obtaining an inepenent realization of F w) + ξ w, where ξ w is an unknown zero-mean ranom variable, such that E[ξw max { 1, w }. In the banit. We note that this slightly eviates from the more common assumption in the banits/erivative-free SCO setting that E[ξ w O1). While such assumptions are equivalent for boune W, we also wish to consier cases with unrestricte omains W = R. In that case, assuming E[ξ w O1) may lea to trivialities in the erivative-free setting. For example, consier the case where F w) = w Aw + b w. hen for any w an any ξ w with uniformly boune variance, we can get a virtually noiseless estimate 4

5 Complexity of Banit an Derivative-Free Stochastic Convex Optimization SCO setting, our goal is to minimize the expecte average regret, namely [ 1 E F w t ) F w ), whereas in the erivative-free SCO setting, our goal is to compute, base on w 1,..., w an the observe values, some point w W, such that the expecte optimization error E [F w ) F w ), is as small as possible. We note that given a banit SCO algorithm with some regret boun, one can get a erivative-free SCO algorithm with the same optimization error boun: we simply run the stochastic banit algorithm, getting w 1,..., w, an returning 1 w t. By Jensen s inequality, the expecte optimization error is at most the expecte average regret with respect to w 1,..., w. hus, banit SCO is only harer than erivative-free SCO. In this paper, we provie upper an lower bouns on the attainable optimization error / average regret, as a function of the imension an the number of rouns/queries. For simplicity, we focus here on bouns which hol in expectation, an an interesting point for further research is to exten these to bouns on the actual error/regret, which hol with high probability. 3. Quaratic Functions In this section, we consier the class of quaratic functions, which have the form F w) = w Aw + b w + c where A is positive-efinite with a minimal eigenvalue boune away from 0). Moreover, to make the problem well-behave, we assume that A has a spectral norm of at most 1, an that b 1, c 1. We note that if the norms are boune but larger than 1, this can be easily hanle by rescaling the function. It is easily seen that such functions are both strongly convex an smooth. Moreover, this is a natural an important class of functions, which in learning applications appears, for instance, in the context of least squares an rige regression. Besies proviing new insights for this class, we will use the techniques evelope here later on, in the more general case of strongly-convex an smooth functions Upper Bouns We begin by showing that for erivative-free SCO, one can obtain an optimization error boun of O / ). o the best of our knowlege, this is the first example of a erivative-free stochastic boun scaling as O1/ ) for a general class of nonlinear functions, as oppose to O1/ ). However, to achieve this result, we nee to make the following mil assumption: of w Aw by picking w = cw for some large c an computing 1 c F w ) + ξ w ). Variants of this iea will also allow virtually noiseless estimates of the linear term. 5

6 Shamir Assumption 1 At least one of the following hols for some fixe ɛ 0, 1: ˆ he quaratic function attains its minimum w in the omain W, an the Eucliean istance of w from the omain bounary is at least ɛ. ˆ We can query not just points in W, but any point whose istance from W is at most ɛ. With strongly-convex functions, the most common case is that W = R, an then both cases actually hol for any value of ɛ. Even in other situations, one of these assumptions virtually always hols. Note that we crucially rely here on the strong-convexity assumption: with say) linear functions, the omain must always be boune an the optimum always lies at the bounary of the omain. With this assumption, the boun we obtain is on the orer of /ɛ. As iscusse earlier, Jamieson et al. 01) recently prove a Ω / ) lower boun for erivative-free SCO, which actually applies to quaratic functions. his oes not contraict our result, since in their example the iameter of W an hence also ɛ) ecays with. In contrast, our O / ) boun hols for fixe ɛ, which we believe is natural in most applications. o obtain this behavior, we utilize a well-known 1-point graient estimate technique, which allows us to get an unbiase estimate of the graient at any point by ranomly querying for a noisy) value of the function aroun it see Nemirovsky an Yuin 1983); Flaxman et al. 005)). Our key insight is that whereas for general functions one must query very close to the point of interest scaling to 0 with ), quaratic functions have aitional structure which allows us to query relatively far away, allowing graient estimates with much smaller variance. he algorithm we use is presente as Algorithm 1, an is computationally efficient. It uses a moification W of the omain W, efine as follows. First, we let B enote some known upper boun on w. If the first alternative of assumption 1 hols, then W consists of all points in W {w : w B}, whose istance from W s bounary is at least ɛ. If the secon alternative hols, then W = W {w : w B}. Note that uner any alternative, it hols that W is convex, that w t B, that w W, an that our algorithm always queries at legitimate points. In the pseuocoe, we use Π W to enote projection on W. For simplicity, we assume that / is an integer an that W inclues the origin 0. Algorithm 1 Derivative-Free SCO Algorithm for Strongly-Convex Quaratic Functions Input: Strong convexity parameter λ > 0; Distance parameter ɛ 0, 1 Initialize w 1 = 0. for t = 1,..., 1 o Pick r { 1, +1} uniformly at ranom Query noisy function value v at point w t + ɛ r v Let g = ɛ r Let w t+1 = Π W wt 1 λt g) en for Return w = t=/ w t. he following theorem quantifies the optimization error of our algorithm. 6

7 Complexity of Banit an Derivative-Free Stochastic Convex Optimization heorem 1 Let F w) = w Aw+b w+c be a λ-strongly convex function, where A, b, c are all at most 1, an suppose the optimum w has a norm of at most B. hen uner Assumption 1, the point w returne by Algorithm 1 satisfies E [F w ) F w log))b + 1)4 ) λɛ. Note that returning w as the average over the last / iterates as oppose to averaging over all iterates) is necessary to avoi log ) factors Rakhlin et al. 01). As an interesting sie-note, we conjecture that a graient-base approach is crucial here to obtain O1/ ) rates in terms of ). For example, a ifferent family of erivative-free methos see for instance Nemirovsky an Yuin 1983); Agarwal et al. 011); Jamieson et al. 01)) is base on a type of noisy binary search, where a few strategically selecte points are repeately sample in orer to estimate which of them has a larger/smaller function value. his is use to shrink the feasible region where the optimum w might lie. Since it is generally impossible to estimate the mean of noisy function values at a rate better than O1/ ), it is not clear if one can get an optimization rate faster than O1/ ) with such methos. he proof of the theorem relies on the following key lemma, whose proof appears in the appenix. Lemma For any w t, we have that an E r,v [ g = F w t ) E r,v [ g 4 B + 1) 4 ɛ. his lemma implies that Algorithm 1 essentially performs stochastic graient escent over the strongly-convex function F w), where the graient estimates are unbiase an with boune secon moments. he returne point is a suffix-average of the last / iterates. Using a convergence analysis for stochastic graient escent with suffix-averaging Rakhlin et al., 01, heorem 5), an plugging in the bouns of Lemma, we get hm Lower Bouns In this subsection, we prove that the upper boun obtaine in hm. 1 is essentially tight: namely, up to constants, the worst-case error rate one can obtain for erivative-free SCO of quaratic functions is orer of /. Besies showing that the algorithm above is essentially optimal, it implies that even for extremely nice strongly-convex functions an omains, the number of queries require to reach some fixe accuracy scales quaratically with the imension. his stans in contrast to the case of linear functions, where the provable query complexity often scales linearly with. heorem 3 Let the number of rouns be fixe. hen for any possibly ranomize) querying strategy, there exists a quaratic function of the form F w) = 1 w e, w, which is minimize at e where e 1, such that the resulting w satisfies } E[F w ) F w ) 0.01 min {1,. 7

8 Shamir Note that since e 1, we know in avance that the optimum must lie in the unit Eucliean ball. Despite this, the lower boun hols even if we o not restrict at all the omain in which we are allowe to query - i.e., it can even be all of R. Proof he proof technique is inspire by a lower boun which appears in Arias-Castro et al. 011), in the ifferent context of compresse sensing. he argument also bears some close similarities to the proof of Assoua s lemma see Cybakov 009)). We will exhibit a istribution over quaratic functions F, such that in expectation over this istribution, any querying strategy will attain Ω / ) optimization error. his implies that for any querying strategy, there exists some eterministic F for which it will have this amount of error. he functions we shall consier are F e w) = 1 w e, w, where e is rawn uniformly from { µ, µ}, with µ 0, 1/ ) being a parameter to be specifie later. Moreover, we will assume that the noise ξ w is a Gaussian ranom variable with zero mean an stanar eviation max { 1, w }. By efinition of 1-strong convexity, it is easy to verify that F e w) F e e) 1 w e. hus, the expecte optimization error over the querying strategy) is at least [ 1 E[F e w ) F e e) E w e E [ 1 w i e i ) E [ µ 1 wi e i <0. 1) We will assume that the querying strategy is eterministic: w t is a eterministic function of the previous query values v 1, v,..., v t 1 at w 1,..., w t 1. his assumption is without loss of generality, since any ranom querying strategy can be seen as a ranomization over eterministic querying strategy. hus, a lower boun which hols uniformly for any eterministic querying strategy woul also hol over a ranomization. o lower boun Eq. 1), we use the following key lemma, which relates this to the question of how informative are the query values as measure by Kullback-Leibler or KL ivergence) for etermining the sign of e s coorinates. Intuitively, the more similar the query values are, the smaller is the KL ivergence an the harer it is to istinguish the true sign of each e i, leaing to a larger lower boun. he proof appears in the appenix. Lemma 4 Let e be a ranom vector, none of whose coorinates is supporte on 0, an let v 1, v,..., v be a sequence of query values obtaine by a eterministic strategy returning a point w so that the query location w t is a eterministic function of v 1,..., v t 1, an w is a eterministic function of v 1,..., v ). hen we have [ E 1 wi e i <0 1 1 U t,i, where U t,i = sup D kl Pr vt e i > 0, {e j } j i, {v l } t 1 ) l=1 Pr vt e i < 0, {e j } j i, {v l } t 1 )) l=1 {e j } j i an D kl represents the KL ivergence between two istributions. 8

9 Complexity of Banit an Derivative-Free Stochastic Convex Optimization Using Lemma 4, we can get a lower boun for the above, provie an upper boun on the U t,i s. o analyze this, consier any fixe values of {e j } j i, an any fixe values of v 1,..., v t 1. Since the querying strategy is assume to be eterministic, it follows that w t is uniquely etermine. Given this w t, the function value v t equals F e w t ) = 1 w t + e j w t,j + µw t,i + ξ wt ) j i conitione on e i > 0, an F e w t ) = 1 w t + e j w t,j µw t,i + ξ wt 3) j i conitione on e i < 0. Comparing Eq. ) an Eq. 3), we notice that they both represent a Gaussian istribution ue to the ξ wt noise term), with stanar eviation max { 1, w t } an means seperate by µw t,i. o boun the ivergence, we use the following stanar result on the KL ivergence between two Gaussians Kullback 1959): Lemma 5 Let N µ, σ ) represent a Gaussian istribution variable with mean µ an variance σ. hen D kl N µ1, σ ) N µ, σ ) ) = µ 1 µ ) σ Using this lemma, it follows that D kl P v t v 1,..., v t 1 ) Qv t v 1,..., v t 1 )) µw t,i ) max {1, w t 4 } = µ w t,i max {1, w t 4 }. Plugging this upper boun on the U t,i s in Lemma 4, we can further lower boun on the expecte optimization error from Eq. 1) by µ 1 1 µ w t,i 4 max {1, w t 4 = µ 1 µ w t } 4 max {1, w t 4 } = µ { } ) 1 µ min w t 4 1, w t µ µ 1. 4) 4 Finally, we choose µ = min{1/, /4 }, an obtain a lower boun of ) } } min {1, > 0.01 min {1, 4 4 as require. he theorem above applies to the optimization error for erivative-free SCO. We now turn to eal with the case of banit SCO an regret, showing an Ω / ) lower boun. 9

10 Shamir Since the erivative-free SCO boun was Θ / ), the result implies a real gap between what can be obtaine in terms of average regret, as oppose to optimization error, without any specific istributional assumptions. his stans in contrast to settings such as multiarme banits, where the construction implying the known Ω / ) lower boun e.g. Cesa-Bianchi an Lugosi 006)) applies equally well to erivative-free an banit SCO see Bubeck et al. 011)). heorem 6 Let the number of rouns be fixe. hen for any possibly ranomize) querying strategy, there exists a quaratic function of the form F w) = 1 w e, w, which is minimize at e where e 1/, such that E [ 1 { } F w t ) F w ) 0.0 min 1,. Note that our lower boun hols even when the omain is unrestricte the algorithm can pick any point in R ). Moreover, the lower boun coincies up to a constant) with the O / ) regret upper-boun shown for strongly-convex an smooth functions in Agarwal et al. 010). his shows that for strongly-convex an smooth functions, the minimax average regret is Θ / ). Also, the lower boun implies that one cannot hope to obtain average regret better than / for more general banit problems, such as strongly-convex or even convex problems. he proof relies on techniques similar to the lower boun of hm. 3, with a key aitional insight. Specifically, in hm. 3, the lower boun obtaine actually epens on the norm of the points w 1,..., w see Eq. 4)), an the optimal w has a very small norm. In a regret minimization setting the points w 1,..., w cannot be too far from w, an thus must have a small norm as well, leaing to a stronger lower boun than that of hm. 3. he formal proof appears in the appenix. 4. Strongly Convex an Smooth Functions We now turn to the more general case of strongly convex an smooth functions. First, we note that in the case of functions which are both strongly convex an smooth, Agarwal et al., 010, heorem 14) alreay provie an O / ) average regret boun which hols even in a non-stochastic setting). he main result of this section is a matching lower boun, which hols even if we look at the much easier case of erivative-free SCO. his lower boun implies that the attainable error for strongly-convex an smooth functions is orer of /, an at least / for any harer setting. heorem 7 Let the number of rouns be fixe. hen for any possibly ranomize) querying strategy, there exists a function F over R which is 0.5-strongly convex an 3.5- smooth; Is 4-Lipschitz over the unit Eucliean ball; has a global minimum in the unit ball; An such that the resulting w satisfies { } E[F w ) F w ) min 1,. 10

11 Complexity of Banit an Derivative-Free Stochastic Convex Optimization Note that we mae no attempt to optimize the constant. he general proof technique is rather similar to that of hm. 3, but the construction is a bit more intricate. Specifically, letting µ > 0 be a parameter to be etermine later, we look at functions of the form F e w) = w e i w i 1 + w i /e i ), where e is uniformly istribute on { µ, +µ}. o see the intuition behin this choice, let us consier the one-imensional case = 1). Recall that in the quaratic setting, the function we consiere in one imension) was of the form F e w) = 1 w ew, where e was chosen uniformly at ranom from { µ, +µ}, an µ is a small number. hus, the optimum is at either µ or µ, an the ifference F µ w) F µ w) at these optima is orer of µ. However, by picking w = Θ1), the ifference F µ w) F µ w) is on the orer of µ - much larger than the ifference close to the optimum, which is orer of µ. herefore, by querying for w far from the optimum, an getting noisy values of F e, it is easier to istinguish whether we are ealing with e = +µ or e = µ, leaing to a / optimization error boun. In contrast, the function we consier here in the one-imensional case) is of the form F e w) = w ew 1 + w/e). 5) his form is carefully esigne so that F µ w) F µ w) is orer of µ, not just at the optima of F µ an F µ, but for all w. his is because of the aitional enominator, which makes the function closer an closer to w the larger w is - see Fig. 1 for a graphical illustration. As a result, no matter how the function is querie, istinguishing the choice of µ is ifficult, leaing to the strong lower boun of hm. 7. A formal proof is presente in the appenix. 5. Discussion In this paper, we consiere the ual settings of banit an erivative-free stochastic convex optimization. We provie a sharp characterization of the attainable performance for strongly-convex an smooth functions. he results also provie useful lower-bouns for more general settings. We also consiere the case of quaratic functions, showing that a fast O1/ ) rate is possible in a stochastic setting, even without knowlege of erivatives. Our results have several qualitative ifferences compare to previously known results which focus on linear functions, such as quaratic epenence on the imension even for extremely nice functions, an a provable gap between the attainable performance in banit optimization an erivative-free optimization. Our work leaves open several questions. For example, we have only ealt with bouns which hol in expectation, an our lower bouns focuse on the epenence on,, where other problem parameters, such as the Lipschitz constant an strong convexity parameter, are fixe constants. While this follows the setting of previous works, it oes not cover 11

12 Shamir Figure 1: he two soli blue lines represents F e w) as in Eq. 5), for e = 0.1 an e = 0.1, whereas the two ashe black lines represent two quaratic functions with similar minimum points. Close to the minima, F e w) an the quaratic functions behave rather similarly. However, as we increase w, the two quaratic functions become rather istinguishable, whereas F e w) become more an more inistinguishable for the two choices of e. hus, istinguishing whether e = 0.1 or e = 0.1, base only on function values is of F e w), is much harer than the quaratic case situations where these parameters scale with. Finally, while this paper settles the case of strongly-convex an smooth functions, we still on t know what is the attainable performance for general convex functions, as well as the more specific case of strongly-convex ) possibly non-smooth) functions. Our Ω / lower boun still hols, but the existing upper bouns are much larger: min /, } { 4 3 / for convex functions, an { 3 min /, } 3 / for strongly-convex functions see table 1). We on t know if the lower boun or the existing upper bouns are tight. However, it is the current upper bouns which seem less natural, an we suspect that they are the ones that can be consierably improve, using new algorithms which remain uniscovere. Acknowlegments We thank John Duchi, Satyen Kale, Robi Krauthgamer an the anonymous reviewers for helpful iscussions an comments. References Y. Abbasi-Yakori, D. Pál, an C. Szepesvári. Improve algorithms for linear stochastic banits. In NIPS,

13 Complexity of Banit an Derivative-Free Stochastic Convex Optimization A. Agarwal, O. Dekel, an L. Xiao. Optimal algorithms for online convex optimization with multi-point banit feeback. In COL, 010. A. Agarwal, D. Foster, D. Hsu, S. Kakae, an A. Rakhlin. Stochastic convex optimization with banit feeback. In NIPS, 011. E. Arias-Castro, E. Canès, an M. Davenport. On the funamental limits of aaptive sensing. CoRR, abs/ , 011. J.-Y. Auibert an S. Bubeck. Minimax policies for aversarial an stochastic banits. In COL, 009. J.-Y. Auibert, S. Bubeck, an G. Lugosi. Minimax policies for combinatorial preiction games. COL, 011. P. Auer, N. Cesa-Bianchi, Y. Freun, an R. Schapire. he nonstochastic multiarme banit problem. SIAM J. Comput., 31):48 77, 00. S. Bubeck an N. Cesa-Bianchi. Regret analysis of stochastic an nonstochastic multi-arme banit problems. CoRR, abs/ , 01. S. Bubeck, R. Munos, an G. Stoltz. Pure exploration in finitely-arme an continuousarme banits. heoretical Computer Science, 4119): , 011. S. Bubeck, N. Cesa-Bianchi, an S. Kakae. owars minimax policies for online linear optimization with banit feeback. In COL, 01. N. Cesa-Bianchi an G. Lugosi. Preiction, learning, an games. Cambrige University Press, Cover an J. homas. Elements of information theory. Wiley, eition, 006. A.B. Cybakov. Introuction to nonparametric estimation. Springer series in statistics. Springer, 009. V. Dani,. Hayes, an S. Kakae. he price of banit information for online optimization. In NIPS, 007. V. Dani,. Hayes, an S. Kakae. Stochastic linear optimization uner banit feeback. In COL, 008. A. Flaxman, A. Kalai, an B. McMahan. Online convex optimization in the banit setting: graient escent without a graient. In SODA, 005. E. Hazan an S. Kale. Beyon the regret minimization barrier: an optimal algorithm for stochastic strongly-convex optimization. In COL, 011. K. Jamieson, R. Nowak, an B. Recht. Query complexity of erivative-free optimization. CoRR, abs/ , 01. S. Kullback. Information heory an Statistics. Dover,

14 Shamir A. Nemirovsky an D. Yuin. Problem Complexity an Metho Efficiency in Optimization. Wiley-Interscience, Y. Nesterov. Ranom graient-free minimization of convex functions. echnical Report 16, ECORE Discussion Paper, 011. A. Rakhlin, O. Shamir, an K. Sriharan. Making graient escent optimal for strongly convex stochastic optimization. In ICML, 01. S. Stich, C. Müller, an B. Gärtner. Optimization of convex functions with ranom pursuit. CoRR, abs/ , 011. M. Zinkevich. Online convex programming an generalize infinitesimal graient ascent. In ICML, 003. Appenix A. Improve Results for Quaratic Functions In Sec. 3, we showe a tight Θ / ) boun on the achievable error for quaratic functions, in the erivative-free SCO setting. his was shown uner the assumption that the noise ξ w is zero-mean an has a secon moment boune by max{1, w }. In this appenix, we show how uner aitional natural assumptions on the noise, one can improve on this result with an efficient algorithm. he main message here is not so much the algorithmic result, but rather to show that the generic noise assumption is important for our lower bouns, an that better algorithms may still be possible for more specific settings. o give a concrete example, consier the classic setting of rige regression, where we have labele training examples x, y) sample i.i.. from some istribution over R R, an our goal is to fin some w R minimizing F w) = λ [ ) w + E x,y) w x y. In a banit / erivative-free SCO setting, we can think of each query as giving as the value of ˆF w) = λ w + w x y). 6) for some specific example x, y), an note that its expecte value over the ranom raw of x, y)) equals F w). hus, it falls within the setting consiere in this paper. However, the noise process is not generic, but has a particular structure. We will show here that one can actually attain an error rate as goo as O/ ) for this problem. o formally present our result, it woul be useful to consier a more general setting, the rige regression setting above being a special case. Suppose we can write F w) as E[ ˆF w), where ˆF w) ecomposes into a eterministic term Rw) an a stochastic quaratic term Ĝw): ˆF w) = Rw) + w Ĝw) = Rw) + Âw + ˆb ) w + ĉ, where Â, ˆb, ĉ are ranom variables. We assume that whenever we query a point w, we get ˆF w) for some ranom realization of Â, ˆb, ĉ. In general, Rw) can be a strongly-convex regularization term, such as λ w in Eq. 6). 14

15 Complexity of Banit an Derivative-Free Stochastic Convex Optimization he algorithm we consier, Algorithm, is a slight variant of Algorithm 1, which takes this ecomposition of F w) into account when constructing its unbiase graient estimate. Compare to Algorithm 1, this algorithm also queries at ranom points further away from w t, up to a istance of. We will assume here that we can always query at such points 3. We also let W = W {w : w B} in the algorithm, where we recall that B is some known upper boun on w. Algorithm Derivative-Free SCO Algorithm for Decomposable-Quaratic Functions Input: Deterministic term R ); Strong convexity parameter λ > 0 Initialize w 1 = 0. for t = 1,..., 1 o Pick r { 1, +1} uniformly at ranom Query noisy function value v at point w t + r Let g = v R w t + r)) r + g R w t ), where g R w) is a subgraient of R ) at w Let w t+1 = Π W wt 1 λt g) en for Return w = w t=/ w t. We now show that with this algorithm, one can improve on our O / ) error upper boun from hm. 1). heorem 8 In the setting escribe above, suppose Â, ˆb, ĉ are all at most 1 with probability 1, the optimum w has a norm of at most B, an g R w) N for any w W. hen uner Assumption 1, the point w returne by Algorithm satisfies E [F w ) F w ) where F is the Frobenius norm. N log)) B + 1) 4 + E λ [  F Note that if we only assume  1, then  F can be as high as, which leas to an O / ) boun, same as in hm. 1. However, it may be much smaller than that. In particular, for the rige regression case we consiere earlier,  correspons to xx where x is a ranomly rawn instance. Uner the common assumption that x O1) inepenent of the imension), it follows that xx F = x 4 = O1). herefore,  F is inepenent of the imension, leaing to an O/ ) error upper boun in terms of,. We remark that even in this specific setting, the O/ ) boun oes not carry over to the banit SCO setting i.e. in terms of regret), since the algorithm requires us to query far away from w t. Also, we again emphasize that this result oes not contraict our lower boun in the quaratic case hm. 3), since the setting there inclue a generic noise term, while here the stochastic noise has a very specific structure. As to the proof of hm. 8, it is very similar to that of hm. 1, the key ifference being a better moment upper boun on the graient estimate ḡ, as formalize in the following lemma. Plugging this improve boun into the calculations results in the theorem. ) 3. Similar to Algorithm 1, if one can only query at some istance ɛ, where ɛ 0, 1, then one can moify the algorithm to hanle such cases, with the resulting error boun epening on ɛ., 15

16 Shamir Lemma 9 For any w t, we have that E r,v [ g is a subgraient of F w t ), an [ )) E r,v [ g 4 N + 3 B + 1) 4 + E  F. Proof By efinition of F w t ), we note that g = w t + r)  w t + r) + ˆb ) w t + r) + ĉ r + g R w t ). Using a similar calculation to the one in the proof of Lemma, we have that the expecte value of this expression over r an Â, ˆb, ĉ is w t E[ + E[ˆb + g R w t ), which is a subgraient of F w t ). As to the moment boun, we have [ E[ g E 4 w t + r)  w t + r)) r ˆb ) + 4 w t + r) r + 4ĉ r + 4 g R w t ) [ 4 E  w t + wt Âr + r Âr) ˆb ) ˆb ) ) + w t + r N = 4 E = 1 [ B + w t B 4 + 4E ) Âr + r Ar + B ˆb ) ) + r [ ) [ ) ) wt Âr + E r Ar N [ ˆb ) ) B + E r N. Letting â i,j enote entry i, j) in Â, an recalling that by efinition of r, E[r ir j = 1 i=j, we have that [ ) E r Ar = E r i r j â i,j = E r i r j r i r j â i,j â i,j i,j i,j,i,j = E ri rj â i,j = E [ â i,j = E  F. i,j i,j 7) Also, using the fact that E[rr is the ientity matrix, we have [ ) [ [ [ E wt Âr = E wt Ârr  w t = E wt  w t E w t  B. Finally, we have [ ˆb ) E r [ˆb = E rr ˆb [ = E ˆb 1. Plugging these inequalities back into Eq. 7), we get that ) E[ g 1 B 4 + 4B + E[  F + 8 B + 1 ) N [ ) = 4 3B B E  F + 4N [ ) 1 B + 1) 4 + E  F + 4N, 16

17 Complexity of Banit an Derivative-Free Stochastic Convex Optimization from which the lemma follows. Appenix B. Aitional Proofs B.1. Proof of Lemma By the way r is picke, we have that E r [r i r j = 1 i=j an that E r [r i r j r k = 0 for all i, j, k. hus, letting E enote expectation w.r.t. r an the ranom function values, we have [ v E[ g = E = E = E ɛ [ ɛ [ ɛ r w + ɛ r) A w + ɛ ) r + b w + ɛ ) ) r + c + ξ wt+ ɛ r r ) w Aw + b w + c + ξ wt+ ɛ r r + ɛ ) [ ) ) r Ar r + E w Ar r + b r r = 0 + w A + b + 0 = F w). Also, by the assumptions on A, b, c an the assumptions on the noise ξ w, we have [ F w t + ɛ ) ) r + ξ wt+ ɛ r [ v E[ g = E ɛ r [ ɛ E as require. ɛ ɛ F w t + sup w: w B+ɛ sup w: w B+ɛ = ɛ E[v = ɛ E ɛ r)) + ξ w t+ ɛ r { F w)) + max 1, w t + ɛ } ) r ) ) w Aw + b w + c + B + 1) ɛ B + ɛ) +B + ɛ) + 1 ) + B + 1) ) 4 B + 1)4 ɛ 17

18 Shamir B.. Proof of Lemma 4 We have the following: [ E 1 wi e i <0 = Pr w i e i < 0) = 1 Pr w i < 0 e i > 0) + Pr w i > 0 e i < 0)) ) = 1 Pr w i > 0 e i > 0) Pr w i > 0 e i < 0)) ) 1 1 Pr w i > 0 e i > 0) Pr w i > 0 e i < 0) 1 1 Pr w i > 0 e i > 0) Pr w i > 0 e i < 0)), 8) where the last inequality is by the fact that for any values a 1,..., a, it hols that a a a a. Consier without loss of generality) the term corresponing to the first coorinate, namely Pr w 1 > 0 e 1 > 0) Pr w 1 > 0 e 1 < 0)). his term equals Pr{e j } j=) e,...,e Pr{e j } j=) e,...,e sup Pr e,...,e 1 D kl ) )) ) Pr w 1 > 0 e 1 > 0, {e j } j= Pr w 1 > 0 e 1 < 0, {e j } j= ) )) Pr w 1 > 0 e 1 > 0, {e j } j= Pr w 1 > 0 e 1 < 0, {e j } j= ) )) w 1 > 0 e 1 > 0, {e j } j= Pr w 1 > 0 e 1 < 0, {e j } j= By Pinsker s inequality an the assumption that w is a eterministic function of v 1,..., v, this expression is at most Pr v 1,..., v e 1 > 0, {e j } j= ) )) Pr v 1,..., v e 1 < 0, {e j } j=, where D kl P Q) is the Kullback-Leibler ivergence between the two istributions. By the chain rule see e.g. Cover an homas 006)), we can upper boun the above by 1 ) D kl Pr v t e 1 > 0, {e j } j=, {v l } t 1 l=1 Plugging these bouns back into Eq. 8), the result follows. )) Pr v t e 1 < 0, {e j } j=, {v l } t 1 l=1. 18

19 Complexity of Banit an Derivative-Free Stochastic Convex Optimization B.3. Proof of hm. 6 We may assume without loss of generality that, an it is enough to show that the expecte average regret is at least 0.0 /. his is because if there was a strategy with < 0.0 average regret after < rouns, then for the case of rouns, we coul just run that strategy for rouns, compute the average w of all points playe so far, an then repeately choose w in the remaining rouns. By Jensen s inequality, this woul imply a < 0.0 average regret after rouns, in contraiction. Let w be an arbitrary eterministic function of w 1,..., w. A proof ientical to that of hm. 3, up to Eq. 4), implies that for any µ > 0, there exists a quaratic function of the form F e = 1 w e, w, with e { µ, µ}, such that E[F e w ) F e w ) E µ 4 1 µ { } min w t 1, w t. In particular, letting w = 1 w t, using Jensen s inequality, an iscaring the min, we get that [ 1 E F e w t ) F e w ) µ 1 µ w t 4. 9) However, we also know that by strong convexity of F e, we have [ 1 E F e w t ) F e w ) 1 w t e. 10) Using the fact that w t = w t e + e w t e + e ) w t e + e, we get that w t e 1 w t e = 1 w t µ. Substituting into Eq. 10) an slightly manipulating the resulting inequality, we get [ w t 1 4 E F e w t ) F e w ) + µ. [ For simplicity, enote the average regret term E 1 F ew t ) F e w ) by R. Substituting the expression above into Eq. 9), we get ) R µ µ R + µ ) µ 4µ 1 R ) µ

20 Shamir Rearranging an simplifying, we get R + µ3 R + µ µ ) he equation above can be seen as a quaratic function of R, with the roots ) 1 µ3 ± µ3 + µ 1 µ ). Now, recall that µ is a free parameter that we can choose at will. If we choose it so that 1 µ > 0, then it is easy to show that we get two roots, one strictly positive an one strictly negative. Since we know R is a nonnegative quantity, we get that 1 R = µ3 + µ3 ) + µ 1 µ ) ) µ µ + 4 µ4 + 1 µ. Finally, choosing µ = 1/4 / which inee satisfies 1 µ > 0), an simplifying, we get R Recalling that R is the expecte average regret, it only remains to take the square of the two sies. We note that since we assume, then e = µ = / / 1/, as specifie in the theorem statement. B.4. Proof of hm. 7 Let µ > 0 be a parameter to be etermine later. As iscusse in the text, we will look at functions of the form F e w) = w e i w i 1 + w i /e i ), 11) where e is uniformly istribute on { µ, +µ}. Our goal will be to prove a lower boun on the expecte optimization error over the ranomize choice of F e, with respect to eterministic querying strategies. As explaine in the proof of hm. 3, this woul imply the existence of some fixe F e such that the expecte optimization error over a possibly ranomize) querying strategy is the same. We will nee the following properties of F e : Lemma 10 For any µ > 0 an any e { µ, +µ}, the function F e in Eq. 11) is: ˆ 0.5-Strongly convex an 3.5-smooth 0

21 Complexity of Banit an Derivative-Free Stochastic Convex Optimization ˆ + µ-lipschitz for any w such that w 1. ˆ F e is globally minimize at w = ce, where c = /3 ˆ For any e { µ, +µ} which iffers from e in a single coorinate, an for any w R, it hols that F e w) F e w) µ. Proof Note that we can write the function F e w) as g e i w i ), where g a x) = x ax 1 + x/a). It is not har to realize that to prove the lemma, it is enough to prove that: 1. g a x) is 0.5-strongly convex an 3.5-smooth;. g ax) is at most 4 x + a ; 3. For all µ, g µ x) g µ x) µ ; 4. g a x) is minimize at ca where c = o show item 1, we calculate the secon erivative of g a x), which is 1 + a3 x3a x ) ) a + x ) 3. By efinition of strong convexity an smoothness, it is enough to show that this term is always at least 0.5 an at most 3.5. Substituting x = ay an simplifying, we get 1 + y3 ) y ) 1 + y ) 3. It is a straightforwar exercise to verify that y3 y ) 1+y ) is at most 3/4 for all y R, hence 3 the expression above is always in [0.5, 3.5 as require. As to item, we note that g ax) = x a5 a 3 x 1 x/a) a + x = x a ) 1 + x/a) ). For any value of x/a, the value of the fraction above is easily verifie to be at most 1, hence we can upper boun g ax) by x + a as require. As to item 3, we have g µ x) g µ x) = µx µx = µ 1 + x/µ) µ + x µ, where the last step uses µ + x µx, which follows from the ientity µ + x ) Since this woul imply that F ew) is at most wi + µ) 4w i + µ ) 4w i ) + µ ) = w + µ, which is at most + µ for any w in the unit ball. 1

22 Shamir Finally, as to item 4, we note that this function can be equivalently written as g a x) = a x/a) x/a) ) 1 + x/a). Substituting x = ay, we get a y y/1 + y )). A numerical calculation reveals that the minimizing value of y is , hence the minimizing value of x is a as require. We now begin to erive the lower boun. Using strong convexity an the lemma, we have [ [ [ 1 E[F w ) F w ) E 4 w w = 1 4 E w i wi ) 1 4 E wi ) 1 wi wi <0 [ [ 1 4 E ei ) 1 wi e 3 i <0 = µ 36 E 1 wi e i <0 1) We now lower boun this term using Lemma 4. o o so, we nee to upper boun the KL ivergence of the query values at roun t uner the two hypotheses e i = +µ an e i = µ, the other coorinates being fixe. We assume each noise term ξ w is a stanar Gaussian ranom variable. hus, the query value that we see is istribute as F e w t ) + ξ w = w j=1 e j w j 1 + w j /e j ) + ξ w. where one of the coorinates i of e is either +µ or µ an the other coorinates are fixe. his is a Gaussian istribution, with mean F e w t ) an variance 1. By Lemma 10, the ifference between the two means uner the two cases e i = +µ, e i = µ is at most µ, so by Lemma 5, the KL-ivergence is at most µ 4 /. Using Lemma 4, this implies that Eq. 1) is at least µ 1 1 ) µ 4 = µ µ Picking µ = 1/4, we get a lower boun of /144 > /. Finally, note that for this choice of µ, by Lemma 10, our function F e for any realization of e) is + / - Lipschitz in the unit ball, an has a global minimum with norm at most 0.35 /. If, the Lipschitz parameter is at most 4 an the global minimum is insie the unit ball, satisfying the requirements in the theorem statement. If <, then the boun cannot be better than what we woul obtain for = the argument is similar to the one in the proof of hm. 6), which is hus, for any, the boun is at least { } { } min 0.004, = min 1, as require.

An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback

An Optimal Algorithm for Bandit and Zero-Order Convex Optimization with Two-Point Feedback Journal of Machine Learning Research 8 07) - Submitte /6; Publishe 5/7 An Optimal Algorithm for Banit an Zero-Orer Convex Optimization with wo-point Feeback Oha Shamir Department of Computer Science an

More information

Algorithms and matching lower bounds for approximately-convex optimization

Algorithms and matching lower bounds for approximately-convex optimization Algorithms an matching lower bouns for approximately-convex optimization Yuanzhi Li Department of Computer Science Princeton University Princeton, NJ, 08450 yuanzhil@cs.princeton.eu Anrej Risteski Department

More information

Lecture Introduction. 2 Examples of Measure Concentration. 3 The Johnson-Lindenstrauss Lemma. CS-621 Theory Gems November 28, 2012

Lecture Introduction. 2 Examples of Measure Concentration. 3 The Johnson-Lindenstrauss Lemma. CS-621 Theory Gems November 28, 2012 CS-6 Theory Gems November 8, 0 Lecture Lecturer: Alesaner Mąry Scribes: Alhussein Fawzi, Dorina Thanou Introuction Toay, we will briefly iscuss an important technique in probability theory measure concentration

More information

Least-Squares Regression on Sparse Spaces

Least-Squares Regression on Sparse Spaces Least-Squares Regression on Sparse Spaces Yuri Grinberg, Mahi Milani Far, Joelle Pineau School of Computer Science McGill University Montreal, Canaa {ygrinb,mmilan1,jpineau}@cs.mcgill.ca 1 Introuction

More information

A Course in Machine Learning

A Course in Machine Learning A Course in Machine Learning Hal Daumé III 12 EFFICIENT LEARNING So far, our focus has been on moels of learning an basic algorithms for those moels. We have not place much emphasis on how to learn quickly.

More information

Self-normalized Martingale Tail Inequality

Self-normalized Martingale Tail Inequality Online-to-Confience-Set Conversions an Application to Sparse Stochastic Banits A Self-normalize Martingale Tail Inequality The self-normalize martingale tail inequality that we present here is the scalar-value

More information

Lower Bounds for the Smoothed Number of Pareto optimal Solutions

Lower Bounds for the Smoothed Number of Pareto optimal Solutions Lower Bouns for the Smoothe Number of Pareto optimal Solutions Tobias Brunsch an Heiko Röglin Department of Computer Science, University of Bonn, Germany brunsch@cs.uni-bonn.e, heiko@roeglin.org Abstract.

More information

Agmon Kolmogorov Inequalities on l 2 (Z d )

Agmon Kolmogorov Inequalities on l 2 (Z d ) Journal of Mathematics Research; Vol. 6, No. ; 04 ISSN 96-9795 E-ISSN 96-9809 Publishe by Canaian Center of Science an Eucation Agmon Kolmogorov Inequalities on l (Z ) Arman Sahovic Mathematics Department,

More information

Separation of Variables

Separation of Variables Physics 342 Lecture 1 Separation of Variables Lecture 1 Physics 342 Quantum Mechanics I Monay, January 25th, 2010 There are three basic mathematical tools we nee, an then we can begin working on the physical

More information

Robust Forward Algorithms via PAC-Bayes and Laplace Distributions. ω Q. Pr (y(ω x) < 0) = Pr A k

Robust Forward Algorithms via PAC-Bayes and Laplace Distributions. ω Q. Pr (y(ω x) < 0) = Pr A k A Proof of Lemma 2 B Proof of Lemma 3 Proof: Since the support of LL istributions is R, two such istributions are equivalent absolutely continuous with respect to each other an the ivergence is well-efine

More information

Survey Sampling. 1 Design-based Inference. Kosuke Imai Department of Politics, Princeton University. February 19, 2013

Survey Sampling. 1 Design-based Inference. Kosuke Imai Department of Politics, Princeton University. February 19, 2013 Survey Sampling Kosuke Imai Department of Politics, Princeton University February 19, 2013 Survey sampling is one of the most commonly use ata collection methos for social scientists. We begin by escribing

More information

Euler equations for multiple integrals

Euler equations for multiple integrals Euler equations for multiple integrals January 22, 2013 Contents 1 Reminer of multivariable calculus 2 1.1 Vector ifferentiation......................... 2 1.2 Matrix ifferentiation........................

More information

Analyzing Tensor Power Method Dynamics in Overcomplete Regime

Analyzing Tensor Power Method Dynamics in Overcomplete Regime Journal of Machine Learning Research 18 (2017) 1-40 Submitte 9/15; Revise 11/16; Publishe 4/17 Analyzing Tensor Power Metho Dynamics in Overcomplete Regime Animashree Ananumar Department of Electrical

More information

arxiv: v2 [cs.ds] 11 May 2016

arxiv: v2 [cs.ds] 11 May 2016 Optimizing Star-Convex Functions Jasper C.H. Lee Paul Valiant arxiv:5.04466v2 [cs.ds] May 206 Department of Computer Science Brown University {jasperchlee,paul_valiant}@brown.eu May 3, 206 Abstract We

More information

Linear Regression with Limited Observation

Linear Regression with Limited Observation Ela Hazan Tomer Koren Technion Israel Institute of Technology, Technion City 32000, Haifa, Israel ehazan@ie.technion.ac.il tomerk@cs.technion.ac.il Abstract We consier the most common variants of linear

More information

PDE Notes, Lecture #11

PDE Notes, Lecture #11 PDE Notes, Lecture # from Professor Jalal Shatah s Lectures Febuary 9th, 2009 Sobolev Spaces Recall that for u L loc we can efine the weak erivative Du by Du, φ := udφ φ C0 If v L loc such that Du, φ =

More information
