Algorithms and matching lower bounds for approximately-convex optimization

Algorithms and matching lower bounds for approximately-convex optimization

Yuanzhi Li, Department of Computer Science, Princeton University, Princeton, NJ
Andrej Risteski, Department of Computer Science, Princeton University, Princeton, NJ

Abstract

In recent years, a rapidly increasing number of applications in practice requires optimizing non-convex objectives, like training neural networks, learning graphical models, and maximum likelihood estimation. Though simple heuristics such as gradient descent with very few modifications tend to work well, theoretical understanding is very weak. We consider possibly the most natural class of non-convex functions where one could hope to obtain provable guarantees: functions that are approximately convex, i.e. functions f̃ : R^d → R for which there exists a convex function f such that for all x, |f̃(x) − f(x)| ≤ Δ for a fixed value Δ. We then want to minimize f̃, i.e. output a point x̃ such that f̃(x̃) ≤ min_x f̃(x) + ε. It is quite natural to conjecture that for fixed ε the problem gets harder as Δ grows; however, the exact dependency between ε and Δ is not known. In this paper, we significantly improve the known lower bound on Δ as a function of ε, and give an algorithm matching this lower bound for a natural class of convex bodies. More precisely, we identify a function T : R⁺ → R⁺ such that when Δ = O(T(ε)), we can give an algorithm that outputs a point x̃ with f̃(x̃) ≤ min_x f̃(x) + ε within time poly(d, 1/ε). On the other hand, when Δ = Ω̃(T(ε)), we also prove an information-theoretic lower bound: any algorithm that outputs such an x̃ must use a super-polynomial number of evaluations of f̃.

1 Introduction

Optimization of convex functions over a convex domain is a well-studied problem in machine learning, where a variety of algorithms exist to solve the problem efficiently. However, in recent years practitioners face ever more often non-convex objectives, e.g. training neural networks, learning graphical models, clustering data, maximum likelihood estimation, etc. Albeit simple heuristics such as gradient descent with few modifications usually work very well, theoretical understanding in these settings is still largely open.

The most natural class of non-convex functions where one could hope to obtain provable guarantees is functions that are approximately convex: functions f̃ : R^d → R for which there exists a convex function f such that for all x, |f̃(x) − f(x)| ≤ Δ for a fixed value Δ. In this paper, we focus on zero-order optimization of f̃: an algorithm that outputs a point x̃ such that f̃(x̃) ≤ min_x f̃(x) + ε, where the algorithm, in the course of its execution, is allowed to pick points x ∈ R^d and query the value of f̃(x).

30th Conference on Neural Information Processing Systems (NIPS 2016), Barcelona, Spain.
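To make the definition concrete, here is a simple illustrative instance (our own example, not one used in the paper): perturb the convex, 1-Lipschitz function f(x) = ‖x‖₂ on the unit ball by a bounded oscillation.

\[
f(x) = \|x\|_2, \qquad
\tilde f(x) = \|x\|_2 + \Delta\,\cos\!\Big(\tfrac{2\|x\|_2}{\Delta}\Big),
\qquad
|\tilde f(x) - f(x)| \le \Delta \ \text{ for all } x .
\]

Along every ray from the origin, f̃ has spurious local minima spaced Θ(Δ) apart, and a local method that stops in one of them at radius t is still roughly t above the global minimum, even though the perturbation is only Δ. Even this simple example defeats naive local search, which is why the question below is phrased in terms of the number of value queries that any algorithm must make.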

Trivially, one can solve the problem by constructing an ε-net and searching through all the net points. However, such an algorithm requires Ω((1/ε)^d) evaluations of f̃, which is highly inefficient in high dimension. In this paper, we are interested in efficient algorithms: algorithms that run in time poly(d, 1/ε) (in particular, this implies the algorithm makes poly(d, 1/ε) evaluations of f̃).

One extreme case of the problem is Δ = 0, which is just standard convex optimization, where algorithms exist to solve it in polynomial time for every ε > 0. However, even when Δ is any quantity > 0, none of these algorithms extend without modification. (Indeed, we are not imposing any structure on f̃ − f, like stochasticity.) Of course, when Δ = +∞, the problem includes arbitrary non-convex optimization, where we cannot hope for an efficient solution for any finite ε. Therefore, the crucial quantity to study is the optimal tradeoff between Δ and ε: for which (Δ, ε) the problem can be solved in polynomial time, and for which it cannot.

In this paper, we study the rate of Δ as a function of ε: we identify a function T : R⁺ → R⁺ such that when Δ = O(T(ε)), we can give an algorithm that outputs a point x̃ such that f̃(x̃) ≤ min_x f̃(x) + ε within time poly(d, 1/ε) over a natural class of well-conditioned convex bodies. On the other hand, when Δ = Ω̃(T(ε)), we also prove an information-theoretic lower bound: any algorithm that outputs such an x̃ must use a super-polynomial number of evaluations of f̃. Our result can be summarized in the following two theorems.

Theorem 1 (Algorithmic upper bound, informal). There exists an algorithm A such that for any function f̃ over a well-conditioned convex set in R^d of diameter 1 which is Δ-close to a 1-Lipschitz convex function f, and Δ = O(max{ε²/√d, ε/d}), A finds a point x̃ such that f̃(x̃) ≤ min_x f̃(x) + ε within time poly(d, 1/ε).

The notion of well-conditioning will be formally defined in Section 3, but intuitively it captures the requirement that the convex body curves in all directions to a good extent.

Theorem 2 (Information-theoretic lower bound, informal). For every algorithm A and every ε, Δ with Δ = Ω̃(max{ε²/√d, ε/d}) (Footnote 1), there exists a function f̃ on a convex set in R^d of diameter 1, where f̃ is Δ-close to a 1-Lipschitz convex function f, such that A cannot find a point x̃ with f̃(x̃) ≤ min_x f̃(x) + ε in poly(d, 1/ε) evaluations of f̃ (Footnote 2).

2 Prior work

To the best of our knowledge, there are three prior works on the problem of approximately-convex optimization, which we summarize briefly below.

On the algorithmic side, the classical paper by [DKS14] considered optimizing smooth convex functions over convex bodies with smooth boundaries. More precisely, they assume a bound on both the gradient and the Hessian of f. Furthermore, they assume that for every small ball centered at a point in the body, a large proportion of the volume of the ball lies in the body. Their algorithm is local search: they show that for a sufficiently small r, in a ball of radius r there is with high probability a point which has a smaller value than the current one, as long as the current value is sufficiently larger than the optimum. For constant-smooth functions only, their algorithm applies when Δ = O(ε/√d).

Also on the algorithmic side, the work by [BLNR15] considers 1-Lipschitz functions, but their algorithm only applies to the case Δ = O(ε/d) (so it is not optimal unless ε = O(1/√d)). Their methods rely on sampling log-concave distributions via hit-and-run walks. The crucial idea is to show that for approximately convex functions, one needs to sample from approximately log-concave distributions, which they show can be done by a form of rejection sampling together with classical methods for sampling log-concave distributions.

Footnote 1: The Ω̃ notation hides polylog(d/ε) factors.
Footnote 2: The assumptions on the diameter of K and the Lipschitz condition are for convenience of stating the results. (See Section 6 for the extension to arbitrary diameter and Lipschitz constant.)

Finally, [SV15] consider information-theoretic lower bounds. They show that when Δ = 1/√d, no algorithm can, in polynomial time, achieve ε = 1/2 − δ when optimizing a convex function over the hypercube. This translates to a super-polynomial information-theoretic lower bound when Δ = Ω̃(ε/√d). They additionally give lower bounds when the approximately convex function is multiplicatively, rather than additively, close to a convex function (Footnote 3).

We also note that a related problem is zero-order optimization, where the goal is to minimize a function to which we only have value-oracle access. The algorithmic motivation here comes from various applications where we only have black-box access to the function we are optimizing, and there is a classical line of work on characterizing the oracle complexity of convex optimization [NY83, NS, DJWW15]. In all of these settings, however, the oracles are either noiseless or the noise is stochastic, usually because the target application is bandit optimization [AD10, AFH+11, Sha12].

3 Overview of results

Formally, we will consider the following scenario.

Definition 3.1. A function f̃ : K → R will be called Δ-approximately convex if there exists a 1-Lipschitz convex function f : K → R such that for all x ∈ K, |f̃(x) − f(x)| ≤ Δ.

For ease of exposition, we also assume that K has diameter 1 (Footnote 4). We consider the problem of optimizing f̃; more precisely, we are interested in finding a point x̃ ∈ K such that

f̃(x̃) ≤ min_{x ∈ K} f̃(x) + ε.

We give the following results.

Theorem 3.1 (Information-theoretic lower bound). For every constant c ≥ 1, there exists a constant c₀ such that for every algorithm A and every d ≥ c₀, there exist a convex set K ⊆ R^d with diameter 1, a Δ-approximately convex function f̃ : K → R, and an ε ∈ [0, 1/64) (Footnote 5) with

Δ ≥ (3c log d) · max{ ε²/√d, ε/d },

such that A fails to output, with probability at least 1/2, a point x̃ ∈ K with f̃(x̃) ≤ min_{x ∈ K} f̃(x) + ε in o((d/ε)^c) time.

In order to state the upper bounds, we will need the definition of a well-conditioned body.

Definition 3.2 (µ-well-conditioned). A convex body K is said to be µ-well-conditioned, for µ ≥ 1, if there exists a function F : R^d → R such that K = {x : F(x) ≤ 0} and for every x ∈ ∂K, ‖∇²F(x)‖₂ / ‖∇F(x)‖₂ ≤ µ.

This notion of well-conditioning of a convex body has, to the best of our knowledge, not been defined before, but it intuitively captures the requirement that the convex body should curve in all directions to a certain extent. In particular, the unit ball has µ = 1.

Theorem 3.2 (Algorithmic upper bound). Let d be a positive integer, δ > 0 a positive real number, and ε, Δ two positive real numbers such that

Δ ≤ (1/6348) · max{ ε²/(µ√d), ε/d }.

Then there exists an algorithm A such that, given any Δ-approximately convex function f̃ over a µ-well-conditioned convex set K ⊆ R^d of diameter 1, A returns, with probability 1 − δ and in time poly(d, 1/ε, log(1/δ)), a point x̃ ∈ K such that

f̃(x̃) ≤ min_{x ∈ K} f̃(x) + ε.

Footnote 3: Though these are not too difficult to derive from the additive ones, considering the convex body has diameter bounded by 1.
Footnote 4: Generalizing to arbitrary Lipschitz constants and diameters is discussed in Section 6.
Footnote 5: Since we normalize f to be 1-Lipschitz and K to have diameter 1, the problem is only interesting for ε ≤ 1.
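As a quick sanity check of Definition 3.2, reading it as a bound on the ratio of the Hessian norm to the gradient norm of the defining function F on the boundary of K, the claim that the unit ball is 1-well-conditioned can be verified directly (this short calculation is ours):

\[
F(x) = \|x\|_2^2 - 1,\qquad K = \{x : F(x) \le 0\} = B_1(0),\qquad
\nabla F(x) = 2x,\qquad \nabla^2 F(x) = 2I .
\]
\[
\text{For } x \in \partial K \ (\|x\|_2 = 1):\qquad
\frac{\|\nabla^2 F(x)\|_2}{\|\nabla F(x)\|_2} \;=\; \frac{2}{2\|x\|_2} \;=\; 1 .
\]

So the unit ball is 1-well-conditioned, matching the remark above. Intuitively, bodies with nearly flat boundary pieces (like the simplex) make this ratio hard to control, which is consistent with the separate, condition-free statement in Theorem 3.3 below.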

For the reader wishing to digest a condition-free version of the above result, the following weaker result is also true (and much easier to prove).

Theorem 3.3 (Algorithmic upper bound, condition-free). Let d be a positive integer, δ > 0 a positive real number, and ε, Δ two positive real numbers such that

Δ ≤ (1/6348) · max{ ε²/√d, ε/d }.

Then there exists an algorithm A such that, given any Δ-approximately convex function f̃ over a convex set K ⊆ R^d of diameter 1, A returns, with probability 1 − δ and in time poly(d, 1/ε, log(1/δ)), a point x̃ ∈ K such that

f̃(x̃) ≤ min_{x ∈ S(K, ε)} f̃(x) + ε,    where S(K, ε) = {x ∈ K : B_ε(x) ⊆ K}.

The result merely states that we can output a value that competes with points well inside the convex body, around which a ball of radius ε still lies inside the body.

The assumptions on the diameter of K and the Lipschitz condition are for convenience of stating the results. It is quite easy to extend both the lower and upper bounds to arbitrary diameter and Lipschitz constant, as we discuss in Section 6.

3.1 Proof techniques

We briefly outline the proof techniques we use. We proceed with the information-theoretic lower bound first. The idea behind the proof is the following. We will construct a function G(x) and a family of convex functions {f_w(x)} depending on a direction w ∈ S (S is the unit sphere in R^d). On one hand, the minimal values of G and f_w are quite different: min_x G(x) ≥ 0, while min_x f_w(x) ≤ −ε. On the other hand, the approximately convex function f̃_w(x) corresponding to f_w(x) will be such that f̃_w(x) = G(x) except in a very small cone around w. Picking w at random, no algorithm making a small number of queries will, with high probability, ever query a point in this cone. Therefore, the algorithm will proceed as if the function were G(x), and fail to optimize f̃_w.

Proceeding to the algorithmic result: since [BLNR15] already shows the existence of an efficient algorithm when Δ = O(ε/d), we only need to give an algorithm that solves the problem when Δ = Ω(ε/d) and Δ = O(ε²/√d) (i.e. when ε and Δ are large). There are two main ideas behind the algorithm. First, we show that the gradient of a smoothed version of f̃ (in the spirit of [FKM05]) at any point x is correlated with x* − x, where x* = argmin_{x ∈ K} f̃(x). This strategy, however, requires averaging the value of f̃ over a ball of radius Ω(ε), which in many cases will not be contained in K (especially when ε is large). Therefore, we come up with a way to extend f̃ outside of K in a manner that maintains the correlation with x* − x.

4 Information-theoretic lower bound

In this section, we present the proof of Theorem 3.1. The idea is to construct a function G(x) and a family of convex functions {f_w(x)} depending on a direction w ∈ S, such that min_x G(x) ≥ 0 and min_x f_w(x) ≤ −ε, together with an approximately convex f̃_w(x) for f_w(x) such that f̃_w(x) = G(x) except in a very small critical region depending on w. Picking w at random, we want to argue that the algorithm will, with high probability, not query the critical region. The convex body K used in the lower bound will be arguably the simplest convex body imaginable: a ball centered at the origin.

We might hope to prove a lower bound even for a linear function f_w, for a start, similarly to [SV15]. A reasonable candidate construction is the following: we set f_w(x) = ⟨w, x⟩ for a randomly chosen unit vector w, and define f̃(x) = 0 when |⟨x, w⟩| ≤ (√(log d)/√d)·‖x‖ and f̃(x) = f_w(x) otherwise (Footnote 6).

Footnote 6: For the proof sketch only, to maintain ease of reading, all of the inequalities we state will be correct only up to constants. In the actual proofs we will be completely formal.

Observe that this translates to Δ = ε·√(log d)/√d. It is a standard concentration-of-measure fact that for most points x in the unit ball, |⟨x, w⟩| ≤ (√(log d)/√d)·‖x‖. This implies that any algorithm that makes a polynomial number of queries to f̃ will, with high probability, see 0 in all of its queries, but clearly min_x f_w(x) = −ε. However, this idea fails to generalize to the optimal range, since Δ = Θ̃(ε/√d) is tight for linear, and even smooth, functions (Footnote 7).

In order to obtain the optimal bound, we need to modify the construction to a non-linear, non-smooth function. We will, in a certain sense, hide a random linear function inside a non-linear function. For a random unit vector w, we consider two regions inside the unit ball: a core C = B_r(0), for an appropriately chosen radius r, and a critical angle A = {x : ⟨x, w⟩ ≥ (√(log d)/√d)·‖x‖}. The convex function f_w will look like ‖x‖^{1+α}, for some α > 0, outside C ∪ A, and like a rescaled inner product ⟨w, x⟩ for x ∈ C ∪ A. We construct f̃_w as f̃_w = f_w wherever f_w(x) is sufficiently large (e.g. f_w(x) > Δ), and f̃_w = Δ otherwise. Such an f_w attains its minimum, of value −ε, at a point in the direction of w. However, since f_w = ‖x‖^{1+α} outside C ∪ A, the algorithm needs to query either A, or C ∖ A, in order to detect w. The former happens with exponentially small probability in high dimension; and for any x ∈ C ∖ A, f_w(x) is proportional to ⟨w, x⟩, whose magnitude is at most (√(log d)/√d)·‖x‖ ≤ (√(log d)/√d)·r, which for the chosen r is small enough to force f̃_w(x) = Δ. Therefore, the algorithm will fail with high probability.

Now we move on to the details of the constructions. We will consider K = B_{1/2}(0): the ball of radius 1/2 in R^d centered at 0 (Footnote 8).

4.1 The family {f_w(x)}

Before delving into the construction, we need the following definition.

Definition 4.1 (Lower Convex Envelope (LCE)). Given a set S ⊆ R^d and a function F : S → R, define the lower convex envelope F_LCE = LCE(F) as the function F_LCE : R^d → R such that for every x ∈ R^d,

F_LCE(x) = max_{y ∈ S} { ⟨x − y, ∇F(y)⟩ + F(y) }.

Proposition 4.1. LCE(F) is convex.

Proof. LCE(F) is a pointwise maximum of linear functions, so the claim follows.

Remark: The LCE of a function F is defined over the entire R^d, while the input function F is only defined on a set S (not necessarily a convex set). When the input function F is convex, LCE(F) can be considered an extension of F to the entire R^d.

To define the family f_w(x), we will need four parameters: a power factor α > 0, a shrinking factor β, a radius factor γ > 0, and a vector w ∈ R^d with ‖w‖ = 1, which we specify in a short bit.

Construction 4.1. Given w, α, β, γ, define the core C = B_γ(0), the critical angle A = {x : ⟨x, w⟩ ≥ β‖x‖}, and let H = K ∖ (C ∪ A). Let h : H → R be defined as h(x) = ‖x‖^{1+α}, and define l_w(x) = −8ε⟨x, w⟩. Finally, let f_w : K → R be

f_w(x) = max{ h_LCE(x), l_w(x) },

where h_LCE = LCE(h) as in Definition 4.1.

We then construct the hard function f̃_w as follows.

Construction 4.2. Consider the function f̃_w : K → R:

f̃_w(x) = f_w(x) if x ∈ K ∖ (C ∩ A);    f̃_w(x) = max{ f_w(x), Δ } otherwise.

Footnote 7: This follows from the results in [DKS14].
Footnote 8: We pick B_{1/2}(0) instead of the unit ball in order to ensure the diameter is 1.
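To make Definition 4.1 more concrete, the following short numerical sketch (ours, purely illustrative) evaluates the lower convex envelope at a query point using a finite sample of anchor points y ∈ S and numerically estimated gradients. It only illustrates that F_LCE is a pointwise maximum of tangent-plane lower bounds; with finitely many anchors it computes a lower approximation of the true envelope.

import numpy as np

def numerical_grad(F, y, step=1e-6):
    # Central-difference estimate of the gradient of F at y.
    d = y.shape[0]
    g = np.zeros(d)
    for i in range(d):
        e = np.zeros(d)
        e[i] = step
        g[i] = (F(y + e) - F(y - e)) / (2.0 * step)
    return g

def lce_value(F, anchors, x):
    # Evaluate max over y in anchors of <x - y, grad F(y)> + F(y), i.e. the lower
    # convex envelope of F restricted to the finite anchor set (Definition 4.1).
    return max(np.dot(x - y, numerical_grad(F, y)) + F(y) for y in anchors)

# Tiny usage example: h(y) = ||y||^(1+alpha) sampled on an annulus, as in Construction 4.1.
rng = np.random.default_rng(0)
d, alpha = 5, 0.5
h_fn = lambda y: np.linalg.norm(y) ** (1 + alpha)
anchors = [v / np.linalg.norm(v) * rng.uniform(0.25, 0.5)
           for v in rng.standard_normal((200, d))]
x_query = 0.05 * rng.standard_normal(d)
print(lce_value(h_fn, anchors, x_query))   # a convex lower bound on h at x_query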

Consider the following settings of the parameters β, γ, and α, depending on the magnitude of ε: in the first two regimes (ε at most roughly polylog(d)/√d, and an intermediate range, respectively) we take α = 1/log(1/γ), while in the third regime (ε above roughly 1/(64·polylog d)) we take α = 1; in every case β and γ are chosen as suitable functions of ε, Δ, c, and log d.

With these in place, we formalize the proof intuition from the previous section with the following claims. Following the proof outline, we first show that the minimum of f̃_w is small; in particular, we show that its value at the boundary point in the direction of w is −ε.

Lemma 4.1. f̃_w(w/2) = −ε.

Next, we show that f̃_w is indeed Δ-approximately convex, by showing that |f̃_w(x) − f_w(x)| ≤ Δ for all x ∈ K, and that f_w is 1-Lipschitz and convex.

Proposition 4.2. f̃_w is Δ-approximately convex.

Next, we construct G(x), which does not depend on w; we want to show that an algorithm making a small number of queries to f̃_w cannot distinguish f̃_w from this function.

Construction 4.3. Let G : K → R be defined as:

G(x) = max{ (1/4)‖x‖^{1+α} − (1/4)γ^{1+α}, Δ } if x ∈ K ∖ C;    G(x) = Δ otherwise.

The following is true.

Lemma 4.2. G(x) ≥ 0, and {x ∈ K : G(x) ≠ f̃_w(x)} ⊆ A.

We show how Theorem 3.1 is implied by these statements.

Proof of Theorem 3.1. With everything prior to this set up, the final claim is somewhat standard. We want to show that no algorithm can, with probability ≥ 1/2, output a point x̃ such that f̃_w(x̃) ≤ min_x f̃_w(x) + ε. Since we know that f̃_w(x) agrees with G(x) everywhere except in K ∩ A, and G(x) satisfies min_x G(x) ≥ min_x f̃_w(x) + ε, we only need to show that, with high probability, any polynomial-time algorithm will not query any point in K ∩ A. Consider a (potentially) randomized algorithm A making random choices R₁, R₂, ..., R_m. Conditioned on a particular realization r₁, r₂, ..., r_m of the randomness, for a random choice of w each r_i lies in A with probability at most exp(−2c log(d/ε)), by a standard Gaussian tail bound. Union bounding, since m = o((d/ε)^c) for an algorithm that runs in time o((d/ε)^c), the probability that at least one of the queries of A lies in A is at most 1/2. Since the claim is true for any fixed choice r₁, ..., r_m of the randomness, by averaging, it also holds when r₁, ..., r_m are sampled according to the randomness of the algorithm.

The proofs of all of the lemmas above have been omitted due to space constraints, and are included in full in the appendix.

5 Algorithmic upper bound

As mentioned before, the algorithm in [BLNR15] covers the case Δ = O(ε/d), so we only need to give an algorithm for Δ = Ω(ε/d) and Δ = O(ε²/√d). Our approach will not make use of simulated annealing, but of a more robust version of gradient descent. The intuition comes from [FKM05], who use estimates of the gradient of a convex function derived from Stokes' formula:

E_{w∼S}[ f(x + rw) w ] = (r/d) · ∇ E_{v∼B}[ f(x + rv) ],

where w ∼ S denotes that w is a uniform sample from the unit sphere S, and v ∼ B a uniform sample from the unit ball B. Our observation is that this gradient estimate is robust to noise if we instead use f̃ on the left-hand side. Crucially, "robust" is not in the sense that it approximates the gradient of f; rather, it preserves the crucial property of the gradient of f that we need:

⟨∇f(x), x − x*⟩ ≥ f(x) − f(x*).

In words, this means that if we move x in the direction −∇f(x) for a small step, then x gets closer to x*; we will show that this property is preserved by f̃ when Δ = O(ε²/√d). Indeed, we have that

⟨E_{w∼S}[(d/r) f̃(x + rw) w], x − x*⟩ ≥ ⟨E_{w∼S}[(d/r) f(x + rw) w], x − x*⟩ − (Δd/r)·E_{w∼S}[|⟨w, x − x*⟩|].

The usual [FKM05] calculation shows that ⟨E_{w∼S}[(d/r) f(x + rw) w], x − x*⟩ = Ω(f(x) − f(x*) − r), while (Δd/r)·E_{w∼S}[|⟨w, x − x*⟩|] is bounded by O(Δ√d/r), since E_{w∼S}[|⟨w, x − x*⟩|] = O(1/√d). Therefore, we want f(x) − f(x*) ≳ Δ√d/r + r whenever f(x) − f(x*) ≥ ε. Choosing the parameters optimally leads to r = Θ(ε) and the requirement Δ ≲ ε²/√d.

This intuitive calculation basically proves the simple upper bound guarantee (Theorem 3.3). On the other hand, the argument requires sampling from a ball of radius Ω(ε) around the point x. This is problematic when ε is large: many convex bodies (e.g. the simplex, or the ℓ₁ ball, after rescaling to diameter one) do not contain a ball of radius ε. The idea is then to make the sampling possible by extending f̃ outside of K. Namely, we define a new function g : R^d → R (where Π_K(x) is the projection of x onto K) as

g(x) = f̃(Π_K(x)) + d(x, K).

In general, g(x) will not be convex, but we instead directly bound ⟨E_w[(d/r) g(x + rw) w], x − x*⟩ for x ∈ K and show that it behaves like ⟨∇f(x), x − x*⟩ ≥ f(x) − f(x*).

Algorithm 1: Noisy Convex Optimization
1: Input: a convex set K ⊆ R^d with diam(K) = 1 and 0 ∈ K, and a Δ-approximately convex function f̃.
2: Define g : R^d → R as g(x) = f̃(Π_K(x)) + d(x, K), where Π_K is the projection onto K and d(x, K) is the Euclidean distance from x to K.
3: Initialize x₁ = 0, sampling radius r = ε/(8µ), and step size η and iteration count T polynomial in d and 1/ε.
4: for t = 1, 2, ..., T do
5:    Let v_t = f̃(x_t).
6:    Estimate, to within small ℓ₂ error (by averaging over uniformly random samples w from the unit sphere S):
         g_t ≈ E_{w∼S}[(d/r)·g(x_t + rw)·w].
7:    Update x_{t+1} = Π_K(x_t − η·g_t).
8: end for
9: Output the point x_t attaining min_{t∈[T]} v_t.

The rest of this section is dedicated to showing the following main lemma for Algorithm 1.

Lemma 5.1 (Main, algorithm). Suppose Δ ≤ ε²/(6348·µ·√d). For every t ∈ [T], if there exists x* ∈ K such that f̃(x*) < f̃(x_t) − ε, then ⟨g_t, x* − x_t⟩ ≤ −ε/64.
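The following is a compact Python sketch of Algorithm 1 (ours, for illustration only): the step size, radius, iteration count and sample count are placeholder choices rather than the constants of Theorem 3.2, and the projection onto K is supplied as a black box.

import numpy as np

def noisy_convex_opt(f_tilde, proj_K, d, eps, mu=1.0, T=2000, n_samples=2000, seed=0):
    """Sketch of Algorithm 1: zero-order projected descent on a noisy convex function.

    f_tilde : value oracle for the Delta-approximately convex function.
    proj_K  : Euclidean projection onto the convex body K (0 is assumed to lie in K).
    The constants below are illustrative placeholders, not the ones from the paper.
    """
    rng = np.random.default_rng(seed)
    r = eps / (8.0 * mu)            # sampling radius, as in step 3 (placeholder scaling)
    eta = eps / (64.0 * d)          # step size (placeholder scaling)

    def g(x):
        # Extension of f_tilde outside K: g(x) = f_tilde(Pi_K(x)) + dist(x, K).
        p = proj_K(x)
        return f_tilde(p) + np.linalg.norm(x - p)

    x = np.zeros(d)                 # x_1 = 0, assumed to lie in K
    best_val, best_x = f_tilde(x), x.copy()
    for _ in range(T):
        # Monte-Carlo estimate of E_w[(d/r) * g(x + r w) * w], w uniform on the sphere.
        w = rng.standard_normal((n_samples, d))
        w /= np.linalg.norm(w, axis=1, keepdims=True)
        vals = np.array([g(x + r * wi) for wi in w])
        grad_est = (d / r) * (vals[:, None] * w).mean(axis=0)
        x = proj_K(x - eta * grad_est)          # projected gradient step
        v = f_tilde(x)
        if v < best_val:                        # keep the best value seen, as in step 9
            best_val, best_x = v, x.copy()
    return best_x, best_val

# Hypothetical usage on the unit ball, with a small non-convex perturbation:
# proj = lambda x: x if np.linalg.norm(x) <= 1 else x / np.linalg.norm(x)
# f_t  = lambda x: np.linalg.norm(x - 0.3) + 1e-3 * np.sin(50 * x[0])
# x_hat, val = noisy_convex_opt(f_t, proj, d=10, eps=0.05)

In the sketch, the gradient estimate is recomputed from fresh samples each iteration, matching step 6; the accuracy, step-size and iteration bounds established in Lemma 5.1 and Theorem 3.2 are what determine the actual parameter choices.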

Assuming this lemma, we can prove Theorem 3.2.

Proof of Theorem 3.2. We first focus on the number of iterations. For every t, suppose there exists x* ∈ K with f̃(x*) < f̃(x_t) − ε. Then, since ‖g_t‖ ≤ 2d/r and η is chosen small enough that η‖g_t‖² ≤ ε/64, we have:

‖x* − x_{t+1}‖² ≤ ‖x* − (x_t − ηg_t)‖² = ‖x* − x_t‖² + 2η⟨x* − x_t, g_t⟩ + η²‖g_t‖² ≤ ‖x* − x_t‖² − ηε/32 + η²‖g_t‖² ≤ ‖x* − x_t‖² − ηε/64.

Since originally ‖x* − x₁‖ ≤ 1, the algorithm ends in poly(d, 1/ε) iterations. Now we consider the sample complexity. Since ‖(d/r)·g(x_t + rw)·w‖ is bounded by a polynomial in d and 1/ε, a standard concentration bound shows that poly(d, 1/ε) samples per iteration suffice to estimate the expectation to the required accuracy.

Due to space constraints, we defer the proof of Lemma 5.1 to the appendix.

6 Discussion and open problems

6.1 Arbitrary Lipschitz constants and diameter

We assumed throughout the paper that the convex function f is 1-Lipschitz and that the convex set K has diameter 1. Our results can easily be extended to arbitrary functions and convex sets through a simple linear transformation. For f with Lipschitz constant f_Lip and K with diameter R, and the corresponding approximately convex f̃, define g : K/R → R as g(x) = (1/(R·f_Lip))·f(Rx), where K/R is the rescaling of K by a factor 1/R, and define g̃ analogously from f̃. This translates to |g̃(x) − g(x)| ≤ Δ/(R·f_Lip), and g(x) = (1/(R·f_Lip))·f(Rx) is 1-Lipschitz over a set K/R ⊆ R^d of diameter 1. Therefore, for general functions over general convex sets, our result trivially implies that the rate for being able to optimize approximately-convex functions is

Δ/(R·f_Lip) = max{ (ε/(R·f_Lip))²/√d, (ε/(R·f_Lip))/d },

which simplifies to

Δ = max{ ε²/(R·f_Lip·√d), ε/d }.

6.2 Body-specific bounds

Our algorithmic result matches the lower bound on well-conditioned bodies. The natural open problem is to resolve the problem for arbitrary bodies (Footnote 9). Also note that the lower bound cannot hold for every convex body K ⊆ R^d: for example, if K is just a one-dimensional line in R^d, then the threshold should not depend on d at all. But even when the inherent dimension of K is d, the result is still body-specific: one can show that for f̃ over the simplex in R^d, for a suitable range of ε, it is possible to optimize f̃ in polynomial time even when Δ is considerably larger than the threshold above (Footnote 10). Finally, while our algorithm made use of well-conditioning, what the correct property or parameter of the convex body is that governs the rate T(ε) remains a tantalizing question for future work.

Footnote 9: We do not show it here, but one can prove the upper and lower bounds still hold over the hypercube, and whenever one can find a ball of radius ε most of whose mass lies inside the convex body K.
Footnote 10: Again, we do not show this here, but essentially one can search through the d + 1 lines from the center to the d + 1 corners.
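As a practical companion to the reduction in Section 6.1, the following small wrapper (ours, purely illustrative; the names are hypothetical) maps a general instance, with diameter R and Lipschitz constant f_Lip, to the normalized setting used throughout the paper, so that a normalized solver can be applied and its answer mapped back.

def normalize_instance(f_tilde, proj_K, R, f_lip, center):
    """Rescale a general instance to diameter 1 and Lipschitz constant 1.

    f_tilde : value oracle on the original body K (diameter R, Lipschitz constant f_lip).
    proj_K  : Euclidean projection onto the original body K.
    center  : any point of K, used as the new origin.
    Returns (g_tilde, proj_norm, to_original): the normalized oracle, the projection
    onto the rescaled body (K - center)/R, and the map back to original coordinates.
    A Delta-approximately convex f_tilde becomes Delta/(R*f_lip)-approximately convex,
    matching the rate computation of Section 6.1.
    """
    scale = R * f_lip
    g_tilde = lambda y: f_tilde(center + R * y) / scale
    proj_norm = lambda y: (proj_K(center + R * y) - center) / R
    to_original = lambda y: center + R * y
    return g_tilde, proj_norm, to_original

Combined with the sketch of Algorithm 1 above, one would run the normalized solver on g_tilde with target accuracy ε/(R·f_Lip) and map the returned point back through to_original.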

References

[AD10] Alekh Agarwal and Ofer Dekel. Optimal algorithms for online convex optimization with multi-point bandit feedback. In COLT, 2010.
[AFH+11] Alekh Agarwal, Dean P. Foster, Daniel J. Hsu, Sham M. Kakade, and Alexander Rakhlin. Stochastic convex optimization with bandit feedback. In Advances in Neural Information Processing Systems, 2011.
[BLNR15] Alexandre Belloni, Tengyuan Liang, Hariharan Narayanan, and Alexander Rakhlin. Escaping the local minima via simulated annealing: Optimization of approximately convex functions. In Proceedings of The 28th Conference on Learning Theory, pages 240–265, 2015.
[DJWW15] John C. Duchi, Michael I. Jordan, Martin J. Wainwright, and Andre Wibisono. Optimal rates for zero-order convex optimization: The power of two function evaluations. IEEE Transactions on Information Theory, 61(5), 2015.
[DKS14] Martin Dyer, Ravi Kannan, and Leen Stougie. A simple randomised algorithm for convex optimisation. Mathematical Programming, 147(1-2):207–229, 2014.
[FKM05] Abraham D. Flaxman, Adam Tauman Kalai, and H. Brendan McMahan. Online convex optimization in the bandit setting: gradient descent without a gradient. In Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms, pages 385–394. Society for Industrial and Applied Mathematics, 2005.
[NS] Yurii Nesterov and Vladimir Spokoiny. Random gradient-free minimization of convex functions. Foundations of Computational Mathematics, pages 1–40.
[NY83] Arkadii Nemirovskii and David Borisovich Yudin. Problem Complexity and Method Efficiency in Optimization. Wiley-Interscience Series in Discrete Mathematics. Wiley, Chichester, New York, 1983.
[Sha12] Ohad Shamir. On the complexity of bandit and derivative-free stochastic convex optimization. arXiv preprint arXiv:1209.2388, 2012.
[SV15] Yaron Singer and Jan Vondrák. Information-theoretic lower bounds for convex optimization with erroneous oracles. In Advances in Neural Information Processing Systems, 2015.
