OPTIMAL AFFINE INVARIANT SMOOTH MINIMIZATION ALGORITHMS


ALEXANDRE D'ASPREMONT, CRISTÓBAL GUZMÁN, AND MARTIN JAGGI

ABSTRACT. We formulate an affine invariant implementation of the accelerated first-order algorithm in [Nesterov, 1983]. Its complexity bound is proportional to an affine invariant regularity constant defined with respect to the Minkowski gauge of the feasible set. We extend these results to more general problems, optimizing Hölder smooth functions using $p$-uniformly convex prox terms, and derive an algorithm whose complexity better fits the geometry of the feasible set and adapts to both the best Hölder smoothness parameter and the best gradient Lipschitz constant. Finally, we detail matching complexity lower bounds when the feasible set is an $\ell_p$ ball. In this setting, our upper bounds on iteration complexity for the algorithm in [Nesterov, 1983] are thus optimal in terms of target precision, smoothness and problem dimension.

1. INTRODUCTION

Here, we show how to implement the smooth minimization algorithm described in [Nesterov, 1983, 2005] so that both its iterations and its complexity bound are invariant with respect to a change of coordinates in the problem. We focus on a generic convex minimization problem written

$$\text{minimize } f(x) \quad \text{subject to } x \in Q, \qquad (1)$$

where $f$ is a convex function with Lipschitz continuous gradient and $Q$ is a compact convex set. Without too much loss of generality, we will assume that the interior of $Q$ is nonempty and contains zero. When $Q$ is sufficiently simple, in a sense that will be made precise later, Nesterov [1983] showed that this problem can be solved with a complexity $O(1/\sqrt{\varepsilon})$, where $\varepsilon$ is the target precision. Furthermore, it can be shown that this complexity bound is optimal in $\varepsilon$ for the class of smooth problems [Nesterov, 2003].

While the dependence in $1/\sqrt{\varepsilon}$ of the complexity bound in Nesterov [1983] is optimal in $\varepsilon$, the various factors in front of that bound contain parameters which can heavily vary with implementation, i.e. the choice of norm and prox regularization function. In fact, the full upper bound on the iteration complexity of the optimal algorithm in [Nesterov, 2003] is written

$$\sqrt{\frac{8 L\, d(x^\star)}{\sigma\, \varepsilon}}$$

where $L$ is the Lipschitz constant of the gradient, $d(x^\star)$ the value of the prox at the optimum and $\sigma$ its strong convexity parameter, all varying with the choice of norm and prox. This means in particular that, everything else being equal, this bound is not invariant with respect to an affine change of coordinates. Arguably then, the complexity bound varies while the intrinsic complexity of problem (1) remains unchanged. Optimality in $\varepsilon$ is thus no guarantee of computational efficiency, and a poorly parameterized optimal method can exhibit far from optimal numerical performance. On the other hand, optimal choices of norm and prox, hence of $L$ and $d$, should produce affine invariant bounds. Hence, affine invariance, besides its implications in terms of numerical stability, can also act as a guide to optimally choose norm and prox.

Here, we show how to choose an underlying norm and a prox term for the algorithm in [Nesterov, 1983, 2005] which make its iterations and complexity bound invariant by a change of coordinates. In Section 3, we construct the norm as the Minkowski gauge of centrally symmetric sets $Q$, then derive the prox using a definition of the regularity of Banach spaces used by [Juditsky and Nemirovski, 2008] to derive concentration

1. Part of this work was done during a postdoctoral position at Centrum Wiskunde & Informatica. Date: November 28.

inequalities. These systematic choices allow us to derive an affine invariant bound on the complexity of the algorithm in [Nesterov, 1983]. When $Q$ is an $\ell_p$ ball, we show that this complexity bound is optimal for most, but not all, high dimensional regimes in $(n, \varepsilon)$.

In Section 4, we thus extend our results to much more general problems, deriving a new algorithm to optimize Hölder smooth functions using $p$-uniformly convex prox functions. This extends the results of [Nemirovskii and Nesterov, 1985] by incorporating adaptivity to the Hölder continuity of the gradient, and those of [Nesterov, 2015] by allowing general uniformly convex prox functions, not just strongly convex ones. These additional degrees of freedom allow us to match the optimal complexity lower bounds derived in Section 5 from [Guzmán and Nemirovski, 2015] when optimizing on $\ell_p$ balls, with adaptivity in the Hölder smoothness parameter and Lipschitz constant as a bonus. This means that, on $\ell_p$-balls at least, our complexity bounds are optimal not only in terms of target precision $\varepsilon$, but also in terms of smoothness and problem dimension. This shows that, in the $\ell_p$ setting at least, affine invariance does indeed lead to optimal complexity.

2. SMOOTH OPTIMIZATION ALGORITHM

We first recall the basic structure of the algorithm in [Nesterov, 1983]. While many variants of this method have been derived, we use the formulation in [Nesterov, 2005]. We choose a norm $\|\cdot\|$ and assume that the function $f$ in problem (1) is convex with Lipschitz continuous gradient, so

$$f(y) \le f(x) + \langle \nabla f(x), y - x\rangle + \tfrac{L}{2}\|y - x\|^2, \quad x, y \in Q, \qquad (2)$$

for some $L > 0$. We also choose a prox function $d(x)$ for the set $Q$, i.e. a continuous, strongly convex function on $Q$ with parameter $\sigma$ (see Nesterov [2003] or Hiriart-Urruty and Lemaréchal [1993] for a discussion of regularization techniques using strongly convex functions). We let $x_0$ be the center of $Q$ for the prox-function $d(x)$, so that

$$x_0 \triangleq \mathop{\rm argmin}_{x \in Q}\, d(x);$$

assuming w.l.o.g. that $d(x_0) = 0$, we then get in particular

$$d(x) \ge \tfrac{\sigma}{2}\|x - x_0\|^2. \qquad (3)$$

We write $T_Q(x)$ a solution to the following subproblem

$$T_Q(x) \triangleq \mathop{\rm argmin}_{y \in Q}\Big\{ \langle \nabla f(x), y - x\rangle + \tfrac{L}{2}\|y - x\|^2 \Big\}. \qquad (4)$$

We let $y_0 \triangleq T_Q(x_0)$ where $x_0$ is defined above. We recursively define three sequences of points: the current iterate $x_t$, the corresponding $y_t = T_Q(x_t)$, and the points

$$z_t \triangleq \mathop{\rm argmin}_{x \in Q}\Big\{ \tfrac{L}{\sigma}\, d(x) + \sum_{i=0}^{t} \alpha_i \big[ f(x_i) + \langle \nabla f(x_i), x - x_i\rangle \big] \Big\} \qquad (5)$$

given a step size sequence $\alpha_k \ge 0$ with $\alpha_0 \in (0, 1]$, so that

$$x_{t+1} = \tau_t z_t + (1 - \tau_t) y_t, \qquad y_{t+1} = T_Q(x_{t+1}), \qquad (6)$$

where $\tau_t = \alpha_{t+1}/A_{t+1}$ with $A_t = \sum_{i=0}^{t} \alpha_i$. We implicitly assume here that $Q$ is simple enough so that the two subproblems defining $y_t$ and $z_t$ can be solved very efficiently. We have the following convergence result.

Theorem 2.1 (Nesterov [2005]). Suppose $\alpha_t = (t+1)/2$, with the iterates $x_t$, $y_t$ and $z_t$ defined in (5) and (6); then for any $t \ge 0$ we have

$$f(y_t) - f(x^\star) \le \frac{4 L\, d(x^\star)}{\sigma\,(t+1)^2},$$

where $x^\star$ is an optimal solution to problem (1).

If $\varepsilon > 0$ is the target precision, Theorem 2.1 ensures that Algorithm 1 will converge to an $\varepsilon$-accurate solution in no more than

$$\sqrt{\frac{8 L\, d(x^\star)}{\sigma\,\varepsilon}} \qquad (7)$$

iterations. In practice of course, $d(x^\star)$ needs to be bounded a priori, and $L$ and $\sigma$ are often hard to evaluate.

Algorithm 1 Smooth minimization.
Input: $x_0$, the prox center of the set $Q$.
1: for $t = 0, \ldots, T$ do
2:   Compute $\nabla f(x_t)$.
3:   Compute $y_t = T_Q(x_t)$.
4:   Compute $z_t = \mathop{\rm argmin}_{x \in Q}\big\{ \frac{L}{\sigma} d(x) + \sum_{i=0}^{t} \alpha_i [f(x_i) + \langle \nabla f(x_i), x - x_i\rangle] \big\}$.
5:   Set $x_{t+1} = \tau_t z_t + (1 - \tau_t) y_t$.
6: end for
Output: $x_T, y_T \in Q$.

While most of the parameters in Algorithm 1 are set explicitly, the norm $\|\cdot\|$ and the prox function $d(x)$ are chosen arbitrarily. In what follows, we will see that a natural choice for both makes the algorithm affine invariant.

3. AFFINE INVARIANT IMPLEMENTATION

We can define an affine change of coordinates $x = Ay$, where $A \in \mathbb{R}^{n \times n}$ is a nonsingular matrix, for which the original optimization problem in (1), written

$$\text{minimize } f(x) \quad \text{subject to } x \in Q, \qquad (8)$$

becomes

$$\text{minimize } \hat f(y) \quad \text{subject to } y \in \hat Q, \qquad (9)$$

in the variable $y \in \mathbb{R}^n$, where $\hat f(y) \triangleq f(Ay)$ and $\hat Q \triangleq A^{-1} Q$. Unless $A$ is pathologically ill-conditioned, both problems are equivalent and should have identical complexity bounds and iterations. In fact, the complexity analysis of Newton's method based on the self-concordance argument developed in [Nesterov and Nemirovskii, 1994] produces affine invariant complexity bounds, and the iterates themselves are invariant. Here we will show how to choose the norm $\|\cdot\|$ and the prox function $d(x)$ to get a similar behavior for Algorithm 1.

3.1. Choosing the Norm. We start with a few classical results and definitions. Recall that the Minkowski gauge of a set $Q$ is defined as follows.

Definition 3.1. Given $Q \subset \mathbb{R}^n$ containing zero, we define the Minkowski gauge of $Q$ as

$$\gamma_Q(x) \triangleq \inf\{\lambda \ge 0 : x \in \lambda Q\}$$

with $\gamma_Q(x) = 0$ when $Q$ is unbounded in the direction $x$.
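The gauge can be evaluated directly from the membership test in this definition. As a quick numerical illustration (our own, not from the paper), the sketch below computes $\gamma_Q$ by bisection for $Q$ the unit $\ell_p$ ball, where the gauge has the closed form $\|x\|_p$; the same bisection works for any convex $Q$ with a cheap membership oracle. Membership tests also make affine invariance evident: $x \in \lambda Q$ iff $A^{-1}x \in \lambda A^{-1}Q$, so $\gamma_{A^{-1}Q}(A^{-1}x) = \gamma_Q(x)$.

```python
import numpy as np

def gauge(x, member, hi=1e6, tol=1e-10):
    """Minkowski gauge inf{lam >= 0 : x in lam * Q} by bisection, given a
    membership oracle member(x, lam) answering 'x in lam * Q'."""
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if member(x, mid) else (mid, hi)
    return hi

# Q = B_3, the unit l_3 ball, whose gauge is the l_3 norm
in_b3 = lambda x, lam: np.sum(np.abs(x / lam) ** 3.0) <= 1.0
x = np.array([0.5, -1.0, 2.0])
print(gauge(x, in_b3), np.sum(np.abs(x) ** 3.0) ** (1 / 3.0))  # agree
```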

When $Q$ is a compact convex set, centrally symmetric with respect to the origin and with nonempty interior, the Minkowski gauge defines a norm. We write this norm $\|\cdot\|_Q \triangleq \gamma_Q(\cdot)$. From now on, we will assume that the set $Q$ is centrally symmetric, or use for example $\hat Q = \frac{1}{2}(Q - Q)$ (in the Minkowski sense) for the gauge when it is not (this can be improved, and extending these results to the nonsymmetric case is a classical topic in functional analysis). Note that any linear transform of a centrally symmetric convex set remains centrally symmetric. The following simple result shows why $\|\cdot\|_Q$ is potentially a good choice of norm for Algorithm 1.

Lemma 3.2. Suppose $f : \mathbb{R}^n \to \mathbb{R}$, $Q$ is a centrally symmetric convex set with nonempty interior, and let $A \in \mathbb{R}^{n \times n}$ be a nonsingular matrix. Then $f$ has Lipschitz continuous gradient with respect to the norm $\|\cdot\|_Q$ with constant $L > 0$, i.e.

$$f(y) \le f(x) + \langle \nabla f(x), y - x\rangle + \tfrac{L}{2}\|y - x\|_Q^2, \quad x, y \in Q,$$

if and only if the function $\hat f(w) \triangleq f(Aw)$ has Lipschitz continuous gradient with respect to the norm $\|\cdot\|_{A^{-1}Q}$ with the same constant $L$.

Proof. Let $z, w \in A^{-1}Q$, with $y = Az$ and $x = Aw$; then

$$f(y) \le f(x) + \langle \nabla f(x), y - x\rangle + \tfrac{L}{2}\|y - x\|_Q^2, \quad x, y \in Q,$$

is equivalent to

$$f(Az) \le f(Aw) + \langle A^{-T}\nabla_w \hat f(w), Az - Aw\rangle + \tfrac{L}{2}\|Az - Aw\|_Q^2, \quad z, w \in A^{-1}Q,$$

and, using the fact that $\|Aw\|_Q = \|w\|_{A^{-1}Q}$, this is also

$$f(Az) \le f(Aw) + \langle \nabla_w \hat f(w), z - w\rangle + \tfrac{L}{2}\|z - w\|_{A^{-1}Q}^2, \quad z, w \in A^{-1}Q,$$

hence the desired result.

An almost identical argument shows the following analogous result for the property of strong convexity with respect to the norm $\|\cdot\|_Q$ under affine changes of coordinates. However, starting from the above Lemma 3.2, this can also be seen as a consequence of the well-known duality between smoothness and strong convexity (see e.g. [Hiriart-Urruty and Lemaréchal, 1993, Chap. X, Theorem 4.2.1]).

Theorem 3.3. Let $f : Q \to \mathbb{R}$ be a convex l.s.c. function. Then $f$ is strongly convex w.r.t. the norm $\|\cdot\|$ with constant $\mu > 0$ if and only if its conjugate $f^*$ has Lipschitz continuous gradient w.r.t. the dual norm $\|\cdot\|^*$ with constant $L = 1/\mu$.

From the previous two results, we immediately have the following lemma.

Lemma 3.4. Suppose $f : \mathbb{R}^n \to \mathbb{R}$, $Q$ is a centrally symmetric convex set with nonempty interior, and let $A \in \mathbb{R}^{n \times n}$ be a nonsingular matrix. Then $f$ is strongly convex with respect to the norm $\|\cdot\|_Q$ with parameter $\sigma > 0$, i.e.

$$f(y) \ge f(x) + \langle \nabla f(x), y - x\rangle + \tfrac{\sigma}{2}\|y - x\|_Q^2, \quad x, y \in Q,$$

if and only if the function $\hat f(w) \triangleq f(Aw)$ is strongly convex with respect to the norm $\|\cdot\|_{A^{-1}Q}$ with the same parameter.

We now turn our attention to the choice of prox function in Algorithm 1.
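As a concrete reference point before specializing the norm and prox, here is a minimal sketch (ours) of Algorithm 1 in the plain Euclidean setup, where $Q$ is a Euclidean ball around the prox center, $d(x) = \frac{1}{2}\|x - x_0\|_2^2$ and $\sigma = 1$, so that both subproblems (4) and (5) reduce to projections. The names and the `grad_f`, `L` inputs are our assumptions; this is a sketch of the recursion, not the affine invariant implementation discussed below.

```python
import numpy as np

def nesterov_smooth(grad_f, L, x0, radius, T):
    """Sketch of Algorithm 1 for Q = {x : ||x - x0||_2 <= radius} with
    prox d(x) = ||x - x0||_2^2 / 2 (sigma = 1)."""
    def proj(v):                           # Euclidean projection onto Q
        d = v - x0
        nrm = np.linalg.norm(d)
        return v if nrm <= radius else x0 + (radius / nrm) * d
    x, z, y = x0.copy(), x0.copy(), x0.copy()
    g_acc = np.zeros_like(x0)              # running sum of alpha_i * grad f(x_i)
    A = 0.0
    for t in range(T):
        alpha = (t + 1) / 2.0              # step sizes of Theorem 2.1
        A += alpha
        g = grad_f(x)
        y = proj(x - g / L)                # y_t = T_Q(x_t), from (4)
        g_acc += alpha * g
        z = proj(x0 - g_acc / L)           # z_t from (5), in closed form
        tau = ((t + 2) / 2.0) / (A + (t + 2) / 2.0)   # alpha_{t+1}/A_{t+1}
        x = tau * z + (1 - tau) * y
    return y

# usage: minimize ||x - c||^2/2 over the unit ball; solution is c/||c||
c = np.array([2.0, 0.0, 1.0])
sol = nesterov_smooth(lambda x: x - c, L=1.0, x0=np.zeros(3), radius=1.0, T=200)
```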

3.2. Choosing the Prox. Choosing the norm as $\|\cdot\|_Q$ allows us to define a norm without introducing an arbitrary geometry in the algorithm, since the norm is extracted directly from the problem definition. Notice furthermore that, by Theorem 3.3, when the squared dual norm $(\|\cdot\|_Q^*)^2$ is smooth, we can set $d(x) = \|x\|_Q^2/2$. The immediate impact of this choice is that the term $d(x^\star)$ in (7) is bounded by one, by construction. This choice has other natural benefits which are highlighted below. We first recall a result showing that the conjugate of a squared norm is the squared dual norm.

Lemma 3.5. Let $\|\cdot\|$ be a norm and $\|\cdot\|^*$ its dual norm; then

$$\tfrac{1}{2}(\|y\|^*)^2 = \sup_x\; y^T x - \tfrac{1}{2}\|x\|^2.$$

Proof. We recall the proof in [Boyd and Vandenberghe, 2004, Example 3.27] as it will prove useful in what follows. By definition, $x^T y \le \|y\|^*\|x\|$, hence

$$y^T x - \tfrac{1}{2}\|x\|^2 \le \|y\|^*\|x\| - \tfrac{1}{2}\|x\|^2 \le \tfrac{1}{2}(\|y\|^*)^2$$

because the middle term is a quadratic function of $\|x\|$, with maximum $(\|y\|^*)^2/2$. This maximum is attained by any $x$ such that $x^T y = \|y\|^*\|x\|$ (there must be one by construction of the dual norm), normalized so $\|x\| = \|y\|^*$, which yields the desired result.

Computing the prox-mapping in (4) amounts to taking the conjugate of $\|\cdot\|^2$, so this last result (and its proof) shows that, in the unconstrained case, solving the prox mapping is equivalent to finding a vector aligned with the gradient, with respect to the Minkowski norm $\|\cdot\|_Q$. We now recall another simple result showing that the dual of the norm $\|\cdot\|_Q$ is given by $\|\cdot\|_{Q^\circ}$, where $Q^\circ$ is the polar of the set $Q$.

Lemma 3.6. Let $Q$ be a centrally symmetric convex set with nonempty interior; then $\|\cdot\|_Q^* = \|\cdot\|_{Q^\circ}$.

Proof. We write

$$\|x\|_{Q^\circ} = \inf\{\lambda \ge 0 : x \in \lambda Q^\circ\} = \inf\{\lambda \ge 0 : x^T y \le \lambda, \text{ for all } y \in Q\} = \inf\Big\{\lambda \ge 0 : \sup_{y \in Q} x^T y \le \lambda\Big\} = \sup_{y \in Q}\, x^T y = \|x\|_Q^*,$$

which is the desired result.

In the light of the results above, we conclude that whenever $\|\cdot\|_Q^2$ is smooth we obtain a natural prox function $d(x) = \|x\|_Q^2/2$, whose strong convexity parameter is controlled by the Lipschitz constant of the gradient of the squared dual norm $\|\cdot\|_{Q^\circ}^2/2$. However, this does not cover the case where the squared norm $\|\cdot\|_Q^2$ is not strongly convex. In that scenario, we need to pick the norm based on $Q$, but find a strongly convex prox function not too different from $\|\cdot\|_Q^2$. This is exactly the dual of the problem studied by Juditsky and Nemirovski [2008], who worked on concentration inequalities for vector-valued martingales and defined the regularity of a Banach space $(E, \|\cdot\|_E)$ in terms of the smoothness of the best smooth approximation of the norm $\|\cdot\|_E$. We first recall a few more definitions, and we will then show that the regularity constant defined by Juditsky and Nemirovski [2008] produces an affine invariant bound on the term $d(x^\star)/\sigma$ in the complexity of the smooth algorithm in [Nesterov, 1983].

Definition 3.7. Suppose $\|\cdot\|_X$ and $\|\cdot\|_Y$ are two norms on a space $E$; the distortion $d(\|\cdot\|_X, \|\cdot\|_Y)$ between these two norms is equal to the smallest product $ab > 0$ such that

$$\tfrac{1}{b}\|x\|_Y \le \|x\|_X \le a\|x\|_Y$$

over all $x \in E$.

Note that $\log d(\|\cdot\|_X, \|\cdot\|_Y)$ defines a metric on the set of all symmetric convex bodies in $\mathbb{R}^n$, called the Banach-Mazur distance. We then recall the regularity definition in [Juditsky and Nemirovski, 2008].
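Before doing so, here is a quick numerical check (our own illustration) of Lemma 3.6 in the $\ell_p$ case: the polar of the unit $\ell_p$ ball is the unit $\ell_q$ ball with $1/p + 1/q = 1$, so the dual norm $\sup_{y \in Q} x^T y$ should equal $\|x\|_q$, which the closed-form maximizer from Hölder's equality case makes explicit.

```python
import numpy as np

rng = np.random.default_rng(2)
x, p = rng.standard_normal(6), 1.5
q = p / (p - 1.0)                           # conjugate exponent

# the maximizer of <x, y> over the l_p unit sphere aligns with sign(x)|x|^{q-1}
y = np.sign(x) * np.abs(x) ** (q - 1)
y /= np.sum(np.abs(y) ** p) ** (1 / p)      # normalize so ||y||_p = 1

dual_norm = float(x @ y)                    # sup over Q = B_p, attained here
assert np.isclose(dual_norm, np.sum(np.abs(x) ** q) ** (1 / q))  # = ||x||_q
```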

Definition 3.8. The regularity constant of a Banach space $(E, \|\cdot\|)$ is the smallest constant $\Delta > 0$ for which there exists a smooth norm $\nu(x)$ such that

(i) $\nu(x)^2/2$ has a Lipschitz continuous gradient with constant $\mu$ w.r.t. the norm $\nu(x)$, with $1 \le \mu \le \Delta$;

(ii) the norm $\nu(x)$ satisfies

$$\|x\|^2 \le \nu(x)^2 \le \frac{\Delta}{\mu}\|x\|^2, \quad \text{for all } x \in E, \qquad (10)$$

hence $d(\nu(x), \|x\|)^2 \le \Delta/\mu$.

Note that in finite dimension, since all norms are equivalent to the Euclidean norm with distortion at most $\sqrt{\dim E}$, we know that all finite dimensional Banach spaces are at least $(\dim E)$-regular. Furthermore, the regularity constant is invariant with respect to an affine change of coordinates, since both the distortion and the smoothness bounds are. We are now ready to prove the main result of this section.

Proposition 3.9. Let $\varepsilon > 0$ be the target precision. Suppose that the function $f$ has a Lipschitz continuous gradient with constant $L_Q$ with respect to the norm $\|\cdot\|_Q$, and that the space $(\mathbb{R}^n, \|\cdot\|_{Q^\circ})$ is $\Delta_Q$-regular; then Algorithm 1 will produce an $\varepsilon$-solution to problem (1) in at most

$$\sqrt{\frac{4 L_Q \Delta_Q}{\varepsilon}} \qquad (11)$$

iterations. The constants $L_Q$ and $\Delta_Q$ are affine invariant.

Proof. If $(\mathbb{R}^n, \|\cdot\|_{Q^\circ})$ is $\Delta_Q$-regular, then by Definition 3.8 there exists a norm $\nu(x)$ such that $\nu(x)^2/2$ has a Lipschitz continuous gradient with constant $\mu$ with respect to the norm $\nu(x)$, and [Juditsky and Nemirovski, 2008, Prop. 3.2] shows by conjugacy that the prox function $d(x) \triangleq \nu^*(x)^2/2$ is strongly convex with respect to the norm $\nu^*(x)$ with constant $1/\mu$. Now (10) means that

$$\frac{\mu}{\Delta_Q}\|x\|_Q^2 \le \nu^*(x)^2 \le \|x\|_Q^2, \quad \text{for all } x \in Q,$$

since $\|\cdot\|_{Q^\circ}^* = \|\cdot\|_Q$, hence

$$d(x + y) \ge d(x) + \langle \nabla d(x), y\rangle + \frac{1}{2\mu}\nu^*(y)^2 \ge d(x) + \langle \nabla d(x), y\rangle + \frac{1}{2\Delta_Q}\|y\|_Q^2,$$

so $d(x)$ is strongly convex with respect to $\|\cdot\|_Q$ with constant $\sigma = 1/\Delta_Q$, and, using (10) as above,

$$d(x^\star) = \nu^*(x^\star)^2/2 \le \|x^\star\|_Q^2/2 \le 1/2$$

by definition of $\|\cdot\|_Q$, if $x^\star$ is an optimal (hence feasible) solution of problem (1). The bound in (11) then follows from (7), and its affine invariance follows directly from affine invariance of the distortion, and from Lemmas 3.2 and 3.4.

In Section 5, we will see that, when $Q$ is an $\ell_p$ ball, the complexity bound in (11) is optimal for most, but not all, high dimensional regimes in $(n, \varepsilon)$ where $n$ is greater than $1/\sqrt{\varepsilon}$. In the section that follows, we thus extend Algorithm 1 to much more general problems, optimizing Hölder smooth functions using $p$-uniformly convex prox functions (not just strongly convex ones). These additional degrees of freedom will allow us to match optimal complexity lower bounds, with adaptivity in the Hölder smoothness parameter as a bonus.

4. HÖLDER SMOOTH FUNCTIONS & UNIFORMLY CONVEX PROX

We now extend the results of Section 3 to problems where the objective $f(x)$ is Hölder smooth and the prox function is $p$-uniformly convex, with arbitrary $p \ge 2$. This generalization is necessary to derive optimal complexity bounds for smooth convex optimization over $\ell_p$-balls when $p > 2$, and will require some extensions of the ideas we presented for the standard analysis, which was based on a strongly convex prox. We will consider a slightly different accelerated method, which can be seen as a combination of mirror and gradient steps [Allen-Zhu and Orecchia, 2014]. This variant of the accelerated gradient method is not substantially different however from the one used in the previous section, and its purpose is to make the step-size analysis more transparent. It is worth emphasizing that an interesting byproduct of our method is the analysis of an adaptive step-size policy, which can exploit weaker levels of Hölder continuity of the gradient.

In order to motivate our choice of $p$-uniformly convex prox, we begin with an example highlighting how the difficult geometries of $\ell_p$-spaces when $p > 2$ necessarily lead to weak (dimension-dependent) complexity bounds for any strongly convex prox.

Example 4.1. Let $2 < p \le \infty$ and let $B_p$ be the unit $\ell_p$-ball in $\mathbb{R}^n$. Let $\Psi : \mathbb{R}^n \to \mathbb{R}$ be any strongly convex (prox) function w.r.t. $\|\cdot\|_p$ with constant 1, and suppose w.l.o.g. that $\Psi(0) = 0$. We will prove that

$$\sup_{x \in B_p} \Psi(x) \ge \frac{n^{1-2/p}}{2}.$$

We start from the point $x_0 = 0$ and choose a direction $e_{1+} \in \{e_1, -e_1\}$ so that $\langle \nabla\Psi(0), e_{1+}\rangle \ge 0$. By strong convexity, we have, for $x_1 \triangleq x_0 + e_{1+}/n^{1/p}$,

$$\Psi(x_1) \ge \Psi(x_0) + \frac{\langle \nabla\Psi(0), e_{1+}\rangle}{n^{1/p}} + \frac{1}{2}\|x_1 - x_0\|_p^2 \ge \frac{1}{2 n^{2/p}}.$$

Inductively, we can proceed adding coordinate vectors one by one,

$$x_i \triangleq x_{i-1} + \frac{e_{i+}}{n^{1/p}}, \quad \text{for } i = 1, \ldots, n,$$

where $e_{i+} \in \{e_i, -e_i\}$ is chosen so that $\langle \nabla\Psi(x_{i-1}), e_{i+}\rangle \ge 0$. For this choice we can guarantee

$$\Psi(x_i) \ge \Psi(x_{i-1}) + \frac{\langle \nabla\Psi(x_{i-1}), e_{i+}\rangle}{n^{1/p}} + \frac{1}{2 n^{2/p}} \ge \frac{i}{2 n^{2/p}}.$$

At the end, the vector $x_n$ lies in $B_p$ and $\Psi(x_n) \ge n^{1-2/p}/2$.

4.1. Uniform Convexity and Smoothness. The previous example shows that strong convexity of the prox function is too restrictive when dealing with certain domain geometries, such as $Q = B_p$ when $p > 2$. In order to obtain dimension-independent bounds for these cases, we will have to consider relaxed notions of regularity for the prox, namely $p$-uniform convexity and its dual notion of $q$-uniform smoothness. For simplicity, we will only consider the case of subdifferentiable convex functions, which suffices for our purposes.

Definition 4.2 (Uniform convexity and uniform smoothness). Let $2 \le p < \infty$, $\mu > 0$ and $Q \subseteq \mathbb{R}^n$ a closed convex set. A subdifferentiable function $\Psi : Q \to \mathbb{R}$ is $p$-uniformly convex with constant $\mu$ w.r.t. $\|\cdot\|$ iff for all $x, y \in Q$,

$$\Psi(y) \ge \Psi(x) + \langle \nabla\Psi(x), y - x\rangle + \frac{\mu}{p}\|y - x\|^p. \qquad (12)$$

Now let $1 < q \le 2$ and $L > 0$. A subdifferentiable function $\Phi : Q \to \mathbb{R}$ is $q$-uniformly smooth with constant $L$ w.r.t. $\|\cdot\|$ iff for all $x, y \in Q$,

$$\Phi(y) \le \Phi(x) + \langle \nabla\Phi(x), y - x\rangle + \frac{L}{q}\|y - x\|^q. \qquad (13)$$

From now on, whenever the constant $\mu$ of $p$-uniform convexity is not explicitly stated, $\mu = 1$. We turn our attention to the question of how to obtain an affine invariant prox in the uniformly convex setup. In the previous section it was observed that the regularity constant of the dual space provided such tuning among strongly convex prox functions; however, we are not aware of extensions of this notion to the uniformly smooth setup. Nevertheless, the same purpose can be achieved by directly minimizing the growth factor among the class of uniformly convex functions, which leads to the following notion.
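Before defining it, here is a small numerical illustration (ours) of the construction in Example 4.1 for the particular prox $\Psi(x) = \|x\|_2^2/2$, which is 1-strongly convex w.r.t. $\|\cdot\|_p$ for $p \ge 2$ since $\|\cdot\|_2 \ge \|\cdot\|_p$; the coordinate walk lands exactly on the stated $n^{1-2/p}/2$ value.

```python
import numpy as np

# Example 4.1's walk for Psi(x) = ||x||_2^2 / 2, whose gradient is x:
# adding coordinates with steps n^{-1/p} ends on the l_p sphere with
# Psi(x_n) = n^{1-2/p}/2 exactly.
n, p = 1000, 4.0
x = np.zeros(n)
step = n ** (-1.0 / p)
for i in range(n):
    sign = 1.0 if x[i] >= 0 else -1.0      # ensures <grad Psi(x), e_{i+}> >= 0
    x[i] += sign * step
print(np.sum(np.abs(x) ** p) ** (1 / p))   # = 1.0, so x_n lies in B_p
print(0.5 * x @ x, n ** (1 - 2 / p) / 2)   # both equal n^{1-2/p}/2
```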

Definition 4.3 (Constant of variation). Given a $p$-uniformly convex function $\Psi : \mathbb{R}^n \to \mathbb{R}$, we define its constant of variation on $Q$ as $D_\Psi(Q) \triangleq \sup_{x \in Q}\Psi(x) - \inf_{x \in Q}\Psi(x)$. Furthermore, we define

$$D_{p,Q} \triangleq \inf_\Psi\Big\{ \sup_{x \in Q}\Psi(x) \;:\; \Psi : Q \to \mathbb{R}_+ \text{ is $p$-uniformly convex w.r.t. } \|\cdot\|_Q,\; \Psi(0) = 0 \Big\}. \qquad (14)$$

Some comments are in order. First, for fixed $p$, the constant $D_{p,Q}$ provides the optimal constant of variation among $p$-uniformly convex functions over $Q$, which means that, by construction, $D_{p,Q}$ is affine invariant. Second, Example 4.1 showed that when $2 < p < \infty$, we have $D_{2, B_p} \ge n^{1-2/p}/2$, and the function $\Psi(x) = \|x\|_2^2/2$ shows this bound is tight. We will later see that $D_{p, B_p} = 1$, which is a major improvement for large dimensions. [Juditsky and Nemirovski, 2008, Prop. 3.3] also shows that $D_{2,Q} \le c\,\Delta_Q$, where $\Delta_Q$ is the regularity constant defined in Definition 3.8 and $c > 0$ is an absolute constant, since $\Psi(x)$ is not required to be a norm here.

When $Q$ is the unit ball of a norm, a classical result by Pisier [1975] links the constant of variation in (14) above with the notion of martingale cotype. A Banach space $(E, \|\cdot\|)$ has M-cotype $q$ iff there is some constant $C > 0$ such that for any $T \ge 1$ and any martingale difference sequence $d_1, \ldots, d_T \in E$ we have

$$\Big(\sum_{t=1}^{T} \mathbf{E}\big[\|d_t\|^q\big]\Big)^{1/q} \le C\, \mathbf{E}\Big\|\sum_{t=1}^{T} d_t\Big\|.$$

Pisier [1975] then shows the following result.

Theorem 4.4 (Pisier [1975]). A Banach space $(E, \|\cdot\|)$ has M-cotype $q$ iff there exists a $q$-uniformly convex norm equivalent to $\|\cdot\|$.

In the same spirit, there exists a concrete characterization of a function achieving the optimal constant of variation; see e.g. [Srebro et al., 2011]. Unfortunately, this characterization does not lead to an efficiently computable prox. For the analysis of our accelerated method with uniformly convex prox, we will also need the notion of Bregman divergence.

Definition 4.5 (Bregman divergence). Let $(\mathbb{R}^n, \|\cdot\|)$ be a normed space, and $\Psi : Q \to \mathbb{R}$ be a $p$-uniformly convex function w.r.t. $\|\cdot\|$. We define the Bregman divergence as

$$V_x(y) \triangleq \Psi(y) - \langle \nabla\Psi(x), y - x\rangle - \Psi(x), \quad x \in Q,\; y \in Q.$$

Observe that $V_x(x) = 0$ and $V_x(y) \ge \frac{1}{p}\|y - x\|^p$. For starters, let us prove a simple fact that will be useful in the complexity bounds.

Lemma 4.6. Let $\Psi : Q \to \mathbb{R}$ be a $p$-uniformly convex function, and $V_x(\cdot)$ the corresponding Bregman divergence. Then, for all $x$, $x'$ and $u$ in $Q$,

$$V_x(u) - V_{x'}(u) - V_x(x') = \langle \nabla V_x(x'), u - x'\rangle.$$

Proof. From simple algebra,

$$V_x(u) - V_{x'}(u) - V_x(x') = \Psi(u) - \langle \nabla\Psi(x), u - x\rangle - \Psi(x) - \big[\Psi(u) - \langle \nabla\Psi(x'), u - x'\rangle - \Psi(x')\big] - V_x(x')$$
$$= \langle \nabla\Psi(x') - \nabla\Psi(x), u - x'\rangle + \underbrace{\Psi(x') - \langle \nabla\Psi(x), x' - x\rangle - \Psi(x)}_{= V_x(x')} - V_x(x')$$
$$= \langle \nabla\Psi(x') - \nabla\Psi(x), u - x'\rangle = \langle \nabla V_x(x'), u - x'\rangle,$$

which is the desired result.
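A small numerical check (ours) of Definition 4.5 and Lemma 4.6 for the $p$-uniformly convex choice $\Psi(x) = \|x\|_p^p/p$, whose gradient has the closed form $\mathrm{sign}(x)\,|x|^{p-1}$ coordinatewise:

```python
import numpy as np

p = 4.0
psi  = lambda x: np.sum(np.abs(x) ** p) / p
gpsi = lambda x: np.sign(x) * np.abs(x) ** (p - 1)   # gradient of Psi

def V(x, y):
    """Bregman divergence V_x(y) = Psi(y) - <grad Psi(x), y - x> - Psi(x)."""
    return psi(y) - gpsi(x) @ (y - x) - psi(x)

rng = np.random.default_rng(3)
x, xp, u = rng.standard_normal((3, 5))
lhs = V(x, u) - V(xp, u) - V(x, xp)
rhs = (gpsi(xp) - gpsi(x)) @ (u - xp)     # = <grad V_x(x'), u - x'>
assert np.isclose(lhs, rhs)               # the three-point identity holds
```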

4.2. An Accelerated Method for Minimizing Hölder Smooth Functions. We consider classes of weakly smooth convex functions. For a Hölder exponent $\rho \in (1, 2]$ we denote by $F_\rho(Q, L_\rho)$ the set of convex functions $f : Q \to \mathbb{R}$ such that, for all $x, y \in Q$,

$$\|\nabla f(x) - \nabla f(y)\|^* \le L_\rho\,\|x - y\|^{\rho - 1}.$$

Before describing the method, we first define a step sequence that is useful in the algorithm. For a given $p$ such that $2 \le p < \infty$, consider the sequence $(\gamma_t)_{t \ge 0}$ defined by $\gamma_1 = 1$ and, for any $t \ge 1$, $\gamma_{t+1}$ is the major root of

$$\gamma_{t+1}^p = \gamma_{t+1}^{p-1} + \gamma_t^p.$$

This sequence has the following properties.

Proposition 4.7. The following properties hold for the auxiliary sequence $(\gamma_t)_{t \ge 0}$:
(i) The sequence is increasing.
(ii) $\gamma_t^p = \sum_{s=1}^{t} \gamma_s^{p-1}$.
(iii) $t/p \le \gamma_t \le t$.
(iv) $\sum_{s=1}^{t} \gamma_s^p \le t\,\gamma_t^p$.

Proof. We get:
(i) By definition, $\gamma_{t+1}^p = \gamma_{t+1}^{p-1} + \gamma_t^p \ge \gamma_t^p$, thus $\gamma_{t+1} \ge \gamma_t$.
(ii) By telescoping the recursion, $\gamma_t^p = \sum_{s=1}^{t} \gamma_s^{p-1}$.
(iii) For the lower bound, writing $\gamma_t^p = \gamma_{t+1}^{p-1}(\gamma_{t+1} - 1)$, a Fenchel-type inequality yields

$$\gamma_t = \gamma_{t+1}^{\frac{p-1}{p}}\,[\gamma_{t+1} - 1]^{\frac{1}{p}} \le \frac{p-1}{p}\,\gamma_{t+1} + \frac{\gamma_{t+1} - 1}{p} = \gamma_{t+1} - \frac{1}{p},$$

so the sequence increases by at least $1/p$ at each step. The upper bound is proved by induction as follows:

$$(t+1)^p = (t+1)^{p-1} + t(t+1)^{p-1} > (t+1)^{p-1} + t\big[t^{p-1} + (p-1)t^{p-2}\big] \ge (t+1)^{p-1} + \gamma_t^p,$$

where the last inequality holds by the induction hypothesis $\gamma_t \le t$. As a conclusion, the major root defining $\gamma_{t+1}$ has to be at most $t + 1$.
(iv) By (ii), we have

$$\sum_{s=1}^{t}\gamma_s^p = \sum_{s=1}^{t}\sum_{r=1}^{s}\gamma_r^{p-1} = \sum_{r=1}^{t}(t - r + 1)\,\gamma_r^{p-1} \le t\sum_{r=1}^{t}\gamma_r^{p-1} = t\,\gamma_t^p,$$

which concludes the proof.

We now prove a simple lemma controlling the smoothness of $f$ in terms of $p$. This idea is a minor extension of the inexact gradient trick proposed in [Devolder et al., 2011] and further studied in [Nesterov, 2015], which corresponds to the special case $p = 2$ of the results described here. As in [Devolder et al., 2011], this trick will allow us to minimize Hölder smooth functions by treating their gradient as an inexact oracle on the gradient of a smooth function.

Lemma 4.8. Let $f \in F_\rho(Q, L_\rho)$. Then for any $\delta > 0$ and

$$M \ge \left[\frac{p - \rho}{\rho\, p}\,\frac{1}{\delta}\right]^{\frac{p-\rho}{\rho}} L_\rho^{\frac{p}{\rho}}, \qquad (15)$$

we have that, for all $x, y \in Q$,

$$f(y) \le f(x) + \langle \nabla f(x), y - x\rangle + \frac{M}{p}\|y - x\|^p + \delta.$$

Proof. By assumption on $f$, the following bound holds for any $x, y \in Q$:

$$f(y) \le f(x) + \langle \nabla f(x), y - x\rangle + \frac{L_\rho}{\rho}\|y - x\|^\rho.$$

Notice first that it suffices to show that, for all $t \ge 0$,

$$\frac{L_\rho}{\rho}\, t^\rho \le \frac{M}{p}\, t^p + \delta. \qquad (16)$$

This can be seen by letting $t = \|y - x\|$ and using (16) in the preceding inequality. Let us prove (16). First recall the following Fenchel-type (Young) inequality: if $r, s \ge 1$ and $1/r + 1/s = 1$, then for all nonnegative $x$ and $y$ we have $xy \le \frac{1}{r}x^r + \frac{1}{s}y^s$. For $r = p/\rho$, $s = p/(p - \rho)$, applied to the product $(t^\rho/y)\cdot y$, we obtain

$$\frac{L_\rho}{\rho}\, t^\rho \le \frac{1}{p}\,\frac{L_\rho}{y^{p/\rho}}\, t^p + \frac{L_\rho(p - \rho)}{\rho\, p}\, y^{\frac{p}{p-\rho}}.$$

Now we choose $y$ so that $\delta = \frac{L_\rho(p-\rho)}{\rho\, p}\, y^{\frac{p}{p-\rho}}$, which leads to the inequality

$$\frac{L_\rho}{\rho}\, t^\rho \le \frac{1}{p}\,\frac{L_\rho}{y^{p/\rho}}\, t^p + \delta.$$

Finally, by our choice of $M$ we have $M \ge L_\rho/y^{p/\rho}$, proving (16) and therefore the result.

Algorithm 2 Accelerated Method with Bregman Prox
Input: $x_0 \in Q$
1: $y_0 = x_0$, $z_0 = x_0$, and $A_0 = 0$
2: for $t = 0, \ldots, T-1$ do
3:   $\alpha_{t+1} = \gamma_{t+1}^{p-1}/M$
4:   $A_{t+1} = A_t + \alpha_{t+1}$
5:   $\tau_t = \alpha_{t+1}/A_{t+1}$
6:   $x_{t+1} = \tau_t z_t + (1 - \tau_t) y_t$
7:   Obtain $\nabla f(x_{t+1})$ from the oracle, and update

$$y_{t+1} = \mathop{\rm arg\,min}_{y \in Q}\Big\{\frac{M}{p}\|y - x_{t+1}\|^p + \langle \nabla f(x_{t+1}), y - x_{t+1}\rangle\Big\} \qquad (17)$$

$$z_{t+1} = \mathop{\rm arg\,min}_{z \in Q}\Big\{V_{z_t}(z) + \alpha_{t+1}\langle \nabla f(x_{t+1}), z - z_t\rangle\Big\} \qquad (18)$$

8: end for
9: return $y_T$

As we will show below, the accelerated method described in Algorithm 2 extends the $\ell_p$-setting for acceleration first proposed by [Nemirovskii and Nesterov, 1985] to nonsmooth spaces, using Bregman divergences. This gives us more flexibility in the choice of prox function, and allows us in particular to better fit the geometry of the feasible set.

Proposition 4.9. Let $f \in F_\rho(Q, L_\rho)$ and $\Psi : Q \to \mathbb{R}$ be $p$-uniformly convex w.r.t. $\|\cdot\|$. Then for any $\varepsilon > 0$, setting $\delta \triangleq \varepsilon/T$ and $M$ satisfying (15), the accelerated method in Algorithm 2 guarantees an accuracy

$$f(y_T) - f^\star \le \frac{D_\Psi(Q)}{A_T} + \varepsilon$$

after $T$ iterations.
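Before the proof, note that the method is straightforward to sketch once the two subproblems (17) and (18) are available as oracles, since everything else in the listing is explicit. The helper below is our own hedged sketch: `prox_grad_step` and `mirror_step` stand for user-supplied solvers of (17) and (18), whose form depends on $Q$ and $\Psi$, and the sequence $\gamma_t$ is computed by bisection using the bracketing $\gamma_t < \gamma_{t+1} \le \gamma_t + 1$ from the proof of Proposition 4.7 (iii).

```python
import numpy as np

def major_root(prev, p, iters=80):
    """Largest root of g^p - g^{p-1} - prev^p = 0; it lies in (prev, prev+1]."""
    lo, hi = prev, prev + 1.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mid ** p - mid ** (p - 1) - prev ** p < 0:
            lo = mid
        else:
            hi = mid
    return hi

def accelerated_bregman(grad_f, M, p, prox_grad_step, mirror_step, x0, T):
    """Sketch of Algorithm 2. Assumed oracles:
       prox_grad_step(x, g)     ~ (17): argmin_y (M/p)||y-x||^p + <g, y-x>
       mirror_step(z, g, alpha) ~ (18): argmin_u V_z(u) + alpha <g, u-z>
    both over Q; M should satisfy (15) for the desired accuracy."""
    gamma = 1.0                        # gamma_1
    y, z, A = x0.copy(), x0.copy(), 0.0
    for t in range(T):
        alpha = gamma ** (p - 1) / M   # alpha_{t+1} = gamma_{t+1}^{p-1}/M
        A += alpha
        tau = alpha / A
        x = tau * z + (1 - tau) * y
        g = grad_f(x)
        y = prox_grad_step(x, g)
        z = mirror_step(z, g, alpha)
        gamma = major_root(gamma, p)   # advance the step-size sequence
    return y
```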

Proof. Let $u \in Q$ be an arbitrary vector. Using the optimality conditions for subproblem (18) together with Lemma 4.6, we get

$$\alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - u\rangle = \alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - z_{t+1}\rangle + \alpha_{t+1}\langle \nabla f(x_{t+1}), z_{t+1} - u\rangle$$
$$\le \alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - z_{t+1}\rangle - \langle \nabla V_{z_t}(z_{t+1}), z_{t+1} - u\rangle$$
$$= \alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - z_{t+1}\rangle + V_{z_t}(u) - V_{z_{t+1}}(u) - V_{z_t}(z_{t+1})$$
$$\le \Big[\alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - z_{t+1}\rangle - \frac{1}{p}\|z_t - z_{t+1}\|^p\Big] + V_{z_t}(u) - V_{z_{t+1}}(u).$$

Let us examine the term in brackets closely. For this, let $v \triangleq \tau_t z_{t+1} + (1 - \tau_t) y_t$ and note that $x_{t+1} - v = \tau_t(z_t - z_{t+1})$. With $\tau_t = \alpha_{t+1}/A_{t+1}$ we also have, using Proposition 4.7 (ii),

$$\frac{1}{\tau_t^p} = \gamma_{t+1}^p = M A_{t+1},$$

since $\tau_t = \gamma_{t+1}^{p-1}/\gamma_{t+1}^p = 1/\gamma_{t+1}$. From this we obtain

$$\alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - z_{t+1}\rangle - \frac{1}{p}\|z_t - z_{t+1}\|^p = \frac{\alpha_{t+1}}{\tau_t}\langle \nabla f(x_{t+1}), x_{t+1} - v\rangle - \frac{1}{p\,\tau_t^p}\|x_{t+1} - v\|^p$$
$$= A_{t+1}\Big[\langle \nabla f(x_{t+1}), x_{t+1} - v\rangle - \frac{M}{p}\|x_{t+1} - v\|^p\Big]$$
$$\le A_{t+1}\Big[\langle \nabla f(x_{t+1}), x_{t+1} - y_{t+1}\rangle - \frac{M}{p}\|x_{t+1} - y_{t+1}\|^p\Big]$$
$$\le A_{t+1}\big[f(x_{t+1}) - f(y_{t+1}) + \delta\big],$$

where the first inequality holds by the definition of $y_{t+1}$, and the last inequality holds by Lemma 4.8 and the choice of $M$. This means that

$$\alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - u\rangle \le A_{t+1}\big[f(x_{t+1}) - f(y_{t+1}) + \delta\big] + V_{z_t}(u) - V_{z_{t+1}}(u). \qquad (19)$$

From (19) and other simple estimates,

$$\alpha_{t+1}\big[f(x_{t+1}) - f(u)\big] \le \alpha_{t+1}\langle \nabla f(x_{t+1}), x_{t+1} - u\rangle$$
$$= \alpha_{t+1}\langle \nabla f(x_{t+1}), x_{t+1} - z_t\rangle + \alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - u\rangle$$
$$= \frac{(1 - \tau_t)\alpha_{t+1}}{\tau_t}\langle \nabla f(x_{t+1}), y_t - x_{t+1}\rangle + \alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - u\rangle$$
$$\le \frac{(1 - \tau_t)\alpha_{t+1}}{\tau_t}\big[f(y_t) - f(x_{t+1})\big] + \alpha_{t+1}\langle \nabla f(x_{t+1}), z_t - u\rangle$$
$$\le (A_{t+1} - \alpha_{t+1})\big[f(y_t) - f(x_{t+1})\big] + A_{t+1}\big[f(x_{t+1}) - f(y_{t+1}) + \delta\big] + V_{z_t}(u) - V_{z_{t+1}}(u).$$

Therefore

$$A_{t+1} f(y_{t+1}) + V_{z_{t+1}}(u) - V_{z_t}(u) \le A_t f(y_t) + \alpha_{t+1} f(u) + A_{t+1}\delta.$$

Summing these inequalities, we obtain

$$A_T f(y_T) + V_{z_T}(u) - V_{z_0}(u) \le A_T f(u) + \sum_{t=1}^{T} A_t\,\delta.$$

Now, by Proposition 4.7 (iv), we have $\frac{1}{A_T}\sum_{t=1}^{T} A_t = \frac{1}{\gamma_T^p}\sum_{t=1}^{T}\gamma_t^p \le T$; thus, by the choice $\delta = \varepsilon/T$ and since $V_{z_T}(u) \ge 0$, we obtain

$$f(y_T) - f(u) \le \frac{V_{z_0}(u)}{A_T} + \varepsilon.$$

Definition 4.5, together with the fact that $\langle \nabla\Psi(x), y - x\rangle \ge 0$ when $x$ minimizes $\Psi(x)$ over $Q$, then yields the desired result.

In order to obtain the convergence rate of the method, we need to estimate the value of $A_T$ given the choice of $M$. For this we assume the bound in (15) is satisfied with equality. Since $A_T = \gamma_T^p/M$, we can use Proposition 4.7 (iii) and $\delta = \varepsilon/T$, so that

$$A_T = \frac{\gamma_T^p}{M} \ge \Big(\frac{T}{p}\Big)^p\left[\frac{\rho\, p}{p - \rho}\,\frac{\varepsilon}{T}\right]^{\frac{p-\rho}{\rho}} L_\rho^{-\frac{p}{\rho}}.$$

Notice that to obtain an $\varepsilon$-solution it suffices to have $A_T \ge D_\Psi(Q)/\varepsilon$. By imposing this lower bound on the bound obtained for $A_T$, we get the following complexity estimate.

Corollary 4.10. Let $f \in F_\rho(Q, L_\rho)$ and $\Psi : Q \to \mathbb{R}$ be $p$-uniformly convex w.r.t. $\|\cdot\|$. Setting $\delta \triangleq \varepsilon/T$ and $M$ satisfying (15) with equality, the accelerated method in Algorithm 2 requires at most

$$T = \left[p^p\left(\frac{p-\rho}{\rho\, p}\right)^{\frac{p-\rho}{\rho}}\frac{D_\Psi(Q)\, L_\rho^{\frac{p}{\rho}}}{\varepsilon^{\frac{p}{\rho}}}\right]^{\frac{\rho}{\rho(p+1)-p}} + 1$$

iterations to reach an accuracy $\varepsilon$.

We will later see that the algorithm above leads to optimal complexity bounds (that is, unimprovable up to constant factors) for $\ell_p$-setups. However, our algorithm is highly sensitive to several parameters, the most important being $\rho$ (the smoothness) and $L_\rho$, which sets the step size. We now focus on designing an adaptive step-size policy which does not require $L_\rho$ as input, and which adapts itself to the best weak smoothness parameter $\rho \in (1, 2]$.

4.3. An Adaptive Gradient Method. We will now extend the adaptive algorithm in [Nesterov, 2015, Th. 3] to handle $p$-uniformly convex prox functions using Bregman divergences. This new method with adaptive step-size policy is described as Algorithm 3. From line 5 in Algorithm 3 we get the following identities:

$$A_{t+1}^{p-1} = \alpha_{t+1}^p\, M \qquad (21)$$

$$\frac{1}{\tau_t^p} = M A_{t+1}. \qquad (22)$$

These identities are analogous to the ones derived for the non-adaptive variant. For this reason, the analysis of the adaptive variant is almost identical. There are a few extra details to address, which is what we do now.

First, we need to show that the line-search procedure is feasible, that is, it always terminates in finite time. This is intuitively true from Lemma 4.8, but let us make this intuition precise. From (21) and (22) we have

$$M\tau_t^{p-1} = M\,\frac{\alpha_{t+1}^{p-1}}{A_{t+1}^{p-1}} = \frac{\alpha_{t+1}^{p-1}}{\alpha_{t+1}^{p}} = \frac{1}{\alpha_{t+1}}.$$

Notice that whenever the condition (20) of Algorithm 3 is not satisfied, $M$ is increased by a factor two. Suppose the line search does not terminate; then $\alpha \to 0$. However, by Lemma 4.8, the termination condition (20) is guaranteed to be satisfied as soon as

$$M \ge \left[\frac{p-\rho}{\rho\, p}\,\frac{1}{\varepsilon\tau}\right]^{\frac{p-\rho}{\rho}} L_\rho^{\frac{p}{\rho}},$$

which is a contradiction with $\alpha \to 0$.

To produce convergence rates, we need a lower bound on the sequence $A_t$. Unfortunately, the analysis in [Nesterov, 2015] only works when $p = 2$; we will thus use a different argument. First, notice that by the line-search rule

$$M_{t+1} \le 2\left[\frac{p-\rho}{\rho\, p}\,\frac{1}{\varepsilon\tau_t}\right]^{\frac{p-\rho}{\rho}} L_\rho^{\frac{p}{\rho}}.$$

This allows us to conclude, using $\tau_t = \alpha_{t+1}/A_{t+1}$ and the identities above,

$$\alpha_{t+1} = \tau_t A_{t+1} = A_{t+1}^{\frac{p-1}{p}}\, M_{t+1}^{-\frac{1}{p}} \ge A_{t+1}^{\frac{p-1}{p}}\left[2\left(\frac{p-\rho}{\rho\, p}\right)^{\frac{p-\rho}{\rho}}\left(\frac{A_{t+1}}{\varepsilon\,\alpha_{t+1}}\right)^{\frac{p-\rho}{\rho}} L_\rho^{\frac{p}{\rho}}\right]^{-\frac{1}{p}},$$

Algorithm 3 Accelerated Method with Bregman Prox and Adaptive Stepsize
Input: $x_0 \in Q$
1: Set $y_0 = x_0$, $z_0 = x_0$, $M_0 = 1$ and $A_0 = 0$.
2: for $t = 0, \ldots, T-1$ do
3:   $M = M_t/2$
4:   repeat
5:     Set $M = 2M$, and

$$\alpha = \max\Big\{a : M^{\frac{1}{p-1}}\, a^{\frac{p}{p-1}} - a = A_t\Big\}, \quad A = A_t + \alpha, \quad \tau = \alpha/A, \quad x_{t+1} = \tau z_t + (1 - \tau) y_t$$

6:     Obtain $\nabla f(x_{t+1})$, and compute

$$y_{t+1} = \mathop{\rm arg\,min}_{y \in Q}\Big\{\frac{M}{p}\|y - x_{t+1}\|^p + \langle \nabla f(x_{t+1}), y - x_{t+1}\rangle\Big\}$$

7:   until

$$f(y_{t+1}) \le f(x_{t+1}) + \langle \nabla f(x_{t+1}), y_{t+1} - x_{t+1}\rangle + \frac{M}{p}\|y_{t+1} - x_{t+1}\|^p + \tau\varepsilon \qquad (20)$$

8:   Set $M_{t+1} = M/2$, $\alpha_{t+1} = \alpha$, $A_{t+1} = A$, $\tau_t = \tau$.
9:   Compute

$$z_{t+1} = \mathop{\rm arg\,min}_{z \in Q}\Big\{V_{z_t}(z) + \alpha_{t+1}\langle \nabla f(x_{t+1}), z - z_t\rangle\Big\}$$

10: end for
11: return $y_T$
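The only nonstandard computations in this listing are the scalar equation of line 5, which enforces identity (21), and the doubling loop on $M$. Below is our own hedged sketch of one outer iteration: `prox_grad_step(x, g, M)` is again an assumed oracle for the subproblem of line 6, the norm is taken to be $\ell_p$ to match the setups of Section 5, and the scalar equation is solved by bisection after bracketing its largest root by doubling.

```python
import numpy as np

def solve_alpha(M, A_t, p, iters=80):
    """Largest root a of M^{1/(p-1)} a^{p/(p-1)} - a = A_t (line 5), which
    enforces A^{p-1} = alpha^p M for A = A_t + alpha, i.e. identity (21)."""
    phi = lambda a: M ** (1.0 / (p - 1)) * a ** (p / (p - 1)) - a
    hi = 1.0
    while phi(hi) < A_t:          # bracket the largest root by doubling
        hi *= 2.0
    lo = 0.0
    for _ in range(iters):        # bisection on the increasing branch of phi
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if phi(mid) < A_t else (lo, mid)
    return hi

def adaptive_iteration(f, grad_f, prox_grad_step, y_t, z_t, M_t, A_t, p, eps):
    """One outer iteration of Algorithm 3: double M until (20) holds."""
    M = M_t / 2.0
    while True:
        M *= 2.0
        alpha = solve_alpha(M, A_t, p)
        A, tau = A_t + alpha, alpha / (A_t + alpha)
        x = tau * z_t + (1.0 - tau) * y_t
        g = grad_f(x)
        y = prox_grad_step(x, g, M)
        # termination test (20), with per-step tolerance tau * eps
        if f(y) <= (f(x) + g @ (y - x)
                    + (M / p) * np.sum(np.abs(y - x) ** p) + tau * eps):
            return y, x, g, alpha, A, tau, M / 2.0   # last entry is M_{t+1}
```

Returning $M/2$ implements line 8, so the next iteration restarts its search at $M_t/2$ and reaches the previous $M_t$ after a single doubling.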

Combining the last two bounds gives an inequality involving $\alpha_{t+1}$ and $A_{t+1}$:

$$\alpha_{t+1} \ge \left[\frac{1}{2}\left(\frac{\rho\, p}{p-\rho}\right)^{\frac{p-\rho}{\rho}}\frac{\varepsilon^{\frac{p-\rho}{\rho}}}{L_\rho^{\frac{p}{\rho}}}\right]^{\frac{\rho}{\rho(p+1)-p}} A_{t+1}^{\frac{p(\rho-1)}{\rho(p+1)-p}}.$$

Here is where we need to depart from Nesterov's analysis, as the condition $\gamma \ge 1/2$ used in that proof does not hold. Instead, we show the following bound.

Lemma 4.11. Suppose $\alpha_t \ge 0$, $\alpha_0 = 0$ and $A_t = \sum_{j=0}^{t}\alpha_j$ satisfy

$$\alpha_t \ge \beta A_t^s$$

for some $s \in [0, 1)$ and $\beta \ge 0$. Then, for any $t \ge 0$,

$$A_t \ge \big((1-s)\,\beta\, t\big)^{\frac{1}{1-s}}.$$

Proof. The sequence $A_t$ follows the recursion $A_t - A_{t-1} \ge \beta A_t^s$. The function $h(x) \triangleq x - \beta x^s$ satisfies $h(0) = 0$, $h'(0^+) < 0$, and $h'(x)$ only has a single positive root. Hence, when $A_{t-1} > 0$, the equation $A_t - \beta A_t^s = A_{t-1}$ in the variable $A_t$ only has a single positive root, after which $h(A_t)$ is increasing. This means that to get a lower bound on $A_t$ it suffices to consider the extreme case of the sequence satisfying $A_t - A_{t-1} = \beta A_t^s$. Because $A_t$ is increasing, the sequence $A_t - A_{t-1}$ is increasing, hence there exists an increasing, convex, piecewise affine function $A(t)$ that interpolates $A_t$, whose breakpoints are located at integer values of $t$. By construction, this function satisfies, for any $t \notin \mathbb{N}$,

$$A'(t) = A_{\lceil t\rceil} - A_{\lfloor t\rfloor} = \alpha_{\lceil t\rceil} \ge \beta A_{\lceil t\rceil}^s \ge \beta A(t)^s,$$

so that

$$A'(t) \ge \beta A^s(t) \qquad (23)$$

for any $t \ge 0$. Note that $1/A^s(t)$ has a convergent integral around 0, as $A(t)$ is linear around 0, and $A'(\cdot)$ can be defined as a right continuous nondecreasing function, which is furthermore constant around 0; the functions involved are therefore integrable and the change of variables theorem holds. Integrating the differential inequality, we get

$$\beta t \le \int_0^t \frac{A'(u)}{A^s(u)}\, du = \int_0^{A(t)} \frac{dv}{v^s} = \frac{A(t)^{1-s}}{1-s},$$

yielding the desired result.

Using Lemma 4.11 with $s = p(\rho-1)/(\rho(p+1)-p)$ produces the following bound on $A_T$:

$$A_T \ge \frac{1}{2}\left(\frac{\rho}{\rho(p+1)-p}\right)^{\frac{\rho(p+1)-p}{\rho}}\left(\frac{\rho\, p}{p-\rho}\right)^{\frac{p-\rho}{\rho}}\frac{\varepsilon^{\frac{p-\rho}{\rho}}}{L_\rho^{\frac{p}{\rho}}}\; T^{\frac{\rho(p+1)-p}{\rho}}.$$

To guarantee that $A_T \ge D_\Psi(Q)/\varepsilon$, it suffices to impose

$$T \ge C(p, \rho)\left(\frac{D_\Psi(Q)\, L_\rho^{\frac{p}{\rho}}}{\varepsilon^{\frac{p}{\rho}}}\right)^{\frac{\rho}{\rho(p+1)-p}}, \quad \text{where} \quad C(p, \rho) \triangleq \frac{\rho(p+1)-p}{\rho}\left[2\left(\frac{p-\rho}{\rho\, p}\right)^{\frac{p-\rho}{\rho}}\right]^{\frac{\rho}{\rho(p+1)-p}}.$$

Corollary 4.12. Let $f \in F_\rho(Q, L_\rho)$ for some $\rho \in (1, 2]$, and let $\Psi : Q \to \mathbb{R}$ be $p$-uniformly convex w.r.t. $\|\cdot\|$. Then the number of iterations required by Algorithm 3 to produce a solution with accuracy $\varepsilon$ is bounded by

$$T \le \inf_{1 < \rho \le 2}\; C(p, \rho)\left(\frac{D_\Psi(Q)\, L_\rho^{\frac{p}{\rho}}}{\varepsilon^{\frac{p}{\rho}}}\right)^{\frac{\rho}{\rho(p+1)-p}} + 1.$$

From Corollary 4.12 we obtain the affine invariant bound on iteration complexity. Given a centrally symmetric convex body $Q \subseteq \mathbb{R}^n$, we choose the norm as its Minkowski gauge $\|\cdot\| = \|\cdot\|_Q$, and as $p$-uniformly convex prox the minimizer defining the optimal constant of variation, $\sup_{x \in Q}\Psi(x) = D_{p,Q}$. With these choices, the iteration complexity is

$$T \le \inf_{1 < \rho \le 2}\; C(p, \rho)\left(\frac{D_{p,Q}\, L_{\rho,Q}^{\frac{p}{\rho}}}{\varepsilon^{\frac{p}{\rho}}}\right)^{\frac{\rho}{\rho(p+1)-p}},$$

where $L_{\rho,Q}$ is the Hölder constant of $\nabla f$ quantified in the Minkowski gauge norm $\|\cdot\|_Q$. As a consequence, the bound above is affine invariant, since $D_{p,Q}$ is also affine invariant by construction. Observe that our iteration bound automatically adapts to the best possible weak smoothness parameter $\rho \in (1, 2]$; note however that an implementable algorithm requires an accuracy certificate in order to stop with this adaptive bound. These details are beyond the scope of this paper, but we refer to [Nesterov, 2015] for details. Finally, we will see in what follows that this affine invariant bound also matches corresponding lower bounds when $Q$ is an $\ell_p$ ball.

5. EXPLICIT BOUNDS ON PROBLEMS OVER $\ell_p$ BALLS

5.1. Upper Bounds. To illustrate our results, first consider the problem of minimizing a smooth convex function over the unit simplex, written

$$\text{minimize } f(x) \quad \text{subject to } \mathbf{1}^T x \le 1,\; x \ge 0, \qquad (24)$$

in the variable $x \in \mathbb{R}^n$. As discussed in [Juditsky et al., 2009, §3.3], choosing $\|\cdot\|_1$ as the norm and $d(x) = \log n + \sum_{i=1}^{n} x_i \log x_i$ as the prox function, we have $\sigma = 1$ and $d(x^\star) \le \log n$, which means the complexity of solving (24) using Algorithm 1 is bounded by

$$\sqrt{\frac{8 L_1 \log n}{\varepsilon}}, \qquad (25)$$

where $L_1$ is the Lipschitz constant of $\nabla f$ with respect to the $\ell_1$ norm. This choice of norm and prox has a double advantage here. First, the prox term $d(x^\star)$ grows only as $\log n$ with the dimension. Second, the $\ell_\infty$ norm being the smallest among all $\ell_p$ norms, the smoothness bound $L_1$ is also minimal among all choices of $\ell_p$ norms.

Let us now follow the construction of Section 3. The simplex $C = \{x \in \mathbb{R}^n : \mathbf{1}^T x \le 1, x \ge 0\}$ is not centrally symmetric, but we can symmetrize it as the $\ell_1$ ball. The Minkowski norm associated with that set is then equal to the $\ell_1$-norm, so $\|\cdot\|_Q = \|\cdot\|_1$ here. The space $(\mathbb{R}^n, \|\cdot\|_\infty)$ is $(2\log n)$-regular [Juditsky and Nemirovski, 2008, Example 3.2], with the prox function chosen here as $\|\cdot\|_\alpha^2/2$, with $\alpha = 2\log n/(2\log n - 1)$. Proposition 3.9 then shows that the complexity bound we obtain using this procedure is identical to that in (25). A similar result holds in the matrix case.

5.1.1. Strongly Convex Prox. We can generalize this result to all cases where $Q$ is an $\ell_p$ ball. When $p \in [1, 2]$, [Juditsky et al., 2009, Ex. 3.2] shows that the dual norm $\|\cdot\|_q$, with $1/p + 1/q = 1$, is $\Delta_Q$-regular, with

$$\Delta_Q = \min\Big\{\inf_{2 \le \rho \le q}\,(\rho - 1)\, n^{\frac{2}{\rho} - \frac{2}{q}},\; C \log n\Big\}, \quad \text{when } p \in [1, 2],$$

where $C > 0$ is an absolute constant.
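Before treating the range $p \in [2, \infty]$, let us make the simplex example above concrete. With the entropy prox and $\sigma = 1$, the $z$-step (5) of Algorithm 1 has a classical closed form: minimizing $\frac{L}{\sigma} d(x) + \langle c, x\rangle$ over the simplex gives the exponential-weights update $x_i \propto \exp(-\sigma c_i/L)$, where $c$ accumulates the weighted gradients $\sum_t \alpha_t \nabla f(x_t)$. The sketch below is our own, with assumed names:

```python
import numpy as np

def entropy_z_step(c, L, sigma=1.0):
    """Closed-form z-step (5) over the simplex for the entropy prox
    d(x) = log n + sum_i x_i log x_i: returns x_i ~ exp(-sigma c_i / L)."""
    w = -sigma * c / L
    w -= w.max()                # shift for numerical stability
    x = np.exp(w)
    return x / x.sum()

print(entropy_z_step(np.array([0.3, -1.2, 0.5, 0.0]), L=2.0))  # in the simplex
```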

When $p \in [2, \infty]$, the regularity is only controlled by the distortion $d(\|\cdot\|_q, \|\cdot\|_2)$, since $\|\cdot\|_\alpha^2$ is only smooth when $\alpha \ge 2$. This means that $\|\cdot\|_q$ is $\Delta_Q$-regular, with

$$\Delta_Q = n^{1 - \frac{2}{p}}, \quad \text{when } p \in [2, \infty].$$

This means that the complexity of solving

$$\text{minimize } f(x) \quad \text{subject to } x \in B_p \qquad (26)$$

in the variable $x \in \mathbb{R}^n$, where $B_p$ is the $\ell_p$ ball, using Algorithm 1, is bounded by

$$\sqrt{\frac{4 L_p \Delta_Q}{\varepsilon}},$$

where $L_p$ is the Lipschitz constant of $\nabla f$ with respect to the $\ell_p$ norm. We will later see that this bound is nearly optimal when $1 \le p \le 2$; however, the dimension dependence of the bounds when $p > 2$ is essentially suboptimal. In order to obtain optimal methods in this range we will need our $p$-uniformly convex extensions.

5.1.2. Uniformly Convex Bregman Prox. In the case $2 \le p < \infty$, the function $\Psi_p(w) = \frac{1}{p}\|w\|_p^p$ is $p$-uniformly convex w.r.t. $\|\cdot\|_p$ (see, e.g., [Ball et al., 1994]), and thus

$$D_{p, B_p} = 1, \quad \text{when } p \in [2, \infty).$$

As a consequence, Algorithm 2 with $\Psi_p$ as $p$-uniformly convex prox requires

$$T \le C(p)\left(\frac{L_p}{\varepsilon}\right)^{\frac{p}{p+2}}$$

iterations to reach a target precision $\varepsilon$, where $C(p)$ is a constant depending only on $p$ (which nevertheless diverges as $p \to \infty$). This complexity guarantee admits passage to the limit $p \to \infty$ with a poly-logarithmic extra factor. Note however that in this case we can avoid any dimension dependence with the much simpler Frank-Wolfe method.

5.2. Lower Bounds. We show that, in the case of $\ell_p$ balls, the estimates from the proposed methods are nearly optimal in terms of information-based complexity. We consider the class of problems given by the minimization of smooth convex objectives with a bound $L_p$ on the Lipschitz constant of their gradients w.r.t. the norm $\|\cdot\|_p$, with the feasible domain given by the radius-$R$ ball $B_p(R)$. We emphasize that the lower bounds we present only hold for the large-scale regime, where the number of iterations $T$ is upper bounded by the dimension of the space, $n$. It is well-known that when one can afford a super-linear (in dimension) number of iterations, methods such as the center of gravity or the ellipsoid method can achieve better complexity estimates [Nemirovskii and Yudin, 1979].

First, in the range $1 \le p \le 2$, we can immediately use the lower bound on risk from [Guzmán and Nemirovski, 2015],

$$\Omega\left(\frac{L_p R^2}{T^2 \log(T+1)}\right) \qquad (27)$$

where $T$ is the number of iterations, which translates into the following lower bound on iteration complexity

$$\Omega\left(\sqrt{\frac{L_p R^2}{\varepsilon \log n}}\right) \qquad (28)$$

as a function of the target precision $\varepsilon > 0$. Therefore, the affine invariant algorithm is optimal, up to poly-logarithmic factors, in this range.

For the second range, $2 \le p \le \infty$, the lower bound states that the accuracy after $T$ steps is no better than

$$\Omega\left(\frac{L_p R^2}{\min[p, \log n]\; T^{1 + 2/p}}\right),$$

which translates into the iteration complexity lower bound

$$\Omega\left(\left(\frac{L_p R^2}{\min[p, \log n]\,\varepsilon}\right)^{\frac{p}{p+2}}\right).$$

For fixed $2 \le p < \infty$, this lower bound matches, up to constant factors, our iteration complexity obtained for these setups. For the case $p = \infty$, our algorithm also turns out to be optimal, up to factors polylogarithmic in the dimension.

6. NUMERICAL RESULTS

We now briefly illustrate the numerical performance of our methods on a simple problem taken from [Nesterov, 2015]. To test the adaptivity of Algorithm 3, we focus on solving the following continuous Steiner problem,

$$\min_x\; \frac{1}{m}\sum_{i=1}^{m}\|x - x_i\|_2^q, \qquad (29)$$

in the variable $x \in \mathbb{R}^n$, with parameters $x_i \in \mathbb{R}^n$ for $i = 1, \ldots, m$. The parameter $q \in [1, 2]$ controls the Hölder continuity of the objective. We sample the points $x_i$ uniformly at random in the cube $[0, 1]^n$. We set $n = 50$, $m = 10$ and the target precision $\varepsilon = 10^{-2}$. We compare iterates with the optimum obtained using CVX [Grant et al., 2001]. We observe that while the algorithm solving the three cases $q = 1, 1.5, 2$ is identical, it is significantly faster on smoother problems, as forecast by the adaptive bound in Corollary 4.12.

FIGURE 1. We test the adaptivity of Algorithm 3. Left: convergence plot of Algorithm 3 applied to the continuous Steiner problem (29) for $q = 1, 1.5, 2$, showing $f_t - f^\star$ against iterations. Right: value of the local smoothness parameter $M$ across iterations.
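For reference, the experiment's first-order oracle is simple to set up. The sketch below (our own, with assumed names) builds the objective (29) and its gradient, which is Hölder continuous with exponent $q - 1$; it would be handed to an implementation of Algorithm 3 such as the one sketched earlier.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m, q = 50, 10, 1.5
pts = rng.uniform(0.0, 1.0, size=(m, n))     # the points x_i of problem (29)

def f(x):
    """Continuous Steiner objective (1/m) sum_i ||x - x_i||_2^q."""
    return np.mean(np.linalg.norm(x - pts, axis=1) ** q)

def grad_f(x):
    """Gradient (q/m) sum_i ||x - x_i||^{q-2} (x - x_i), (q-1)-Hölder."""
    d = x - pts                               # shape (m, n)
    r = np.maximum(np.linalg.norm(d, axis=1), 1e-12)   # avoid divide-by-zero
    return (q / m) * ((r ** (q - 2))[:, None] * d).sum(axis=0)
```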

7. CONCLUSION

From a practical point of view, the results above offer guidance in the choice of a prox function depending on the geometry of the feasible set $Q$. On the theoretical side, these results provide affine invariant descriptions of the complexity of an optimization problem based on both the geometry of the feasible set and the smoothness of the objective function. In our first algorithm, this complexity bound is written in terms of the regularity constant of the polar of the feasible set and the Lipschitz constant of $\nabla f$ with respect to the Minkowski norm. In our last two methods, the regularity constant is replaced by a Bregman diameter constructed from an optimal choice of prox. When $Q$ is an $\ell_p$ ball, matching lower bounds on iteration complexity for the algorithm in [Nesterov, 1983] show that these bounds are optimal in terms of target precision, smoothness and problem dimension, up to a polylogarithmic term.

However, while we show that it is possible to formulate an affine invariant implementation of the optimal algorithm in [Nesterov, 1983], we do not yet show that this is always a good idea outside of the $\ell_p$ case... In particular, given our choice of norm, the constants $L_Q$ and $\Delta_Q$ are both affine invariant, with $\Delta_Q$ optimal by our choice of prox function, minimizing the regularity over all smooth squared norms. However, outside of the cases where $Q$ is an $\ell_p$ ball, this does not mean that our choice of norm (the Minkowski gauge of a centrally symmetric feasible set) minimizes the product $L_Q \min\{\Delta_Q, n\}$, hence that we achieve the best possible bound for the complexity of the smooth algorithm in [Nesterov, 1983] and its derivatives. Furthermore, while our bounds give clear indications of what an optimal choice of prox should look like, given a choice of norm, this characterization is not constructive outside of special cases like $\ell_p$-balls.

ACKNOWLEDGMENTS

AA and MJ would like to acknowledge support from the European Research Council (project SIPA). MJ also acknowledges support from the Swiss National Science Foundation (SNSF). The authors would also like to acknowledge support from the chaire Économie des nouvelles données, the data science joint research initiative with the fonds AXA pour la recherche, and a gift from Société Générale Cross Asset Quantitative Research.

REFERENCES

Zeyuan Allen-Zhu and Lorenzo Orecchia. Linear coupling: An ultimate unification of gradient and mirror descent. arXiv preprint, 2014.
K. Ball, E.A. Carlen, and E.H. Lieb. Sharp uniform convexity and smoothness inequalities for trace norms. Inventiones Mathematicae, 115(1):463–482, 1994.
S. Boyd and L. Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
O. Devolder, F. Glineur, and Y. Nesterov. First-order methods of smooth convex optimization with inexact oracle. CORE Discussion Papers (2011/02), 2011.
M. Grant, S. Boyd, and Y. Ye. CVX: Matlab software for disciplined convex programming, 2001.
Cristóbal Guzmán and Arkadi Nemirovski. On lower complexity bounds for large-scale smooth convex optimization. Journal of Complexity, 31(1):1–14, 2015.
Jean-Baptiste Hiriart-Urruty and Claude Lemaréchal. Convex Analysis and Minimization Algorithms. Springer, 1993.
A. Juditsky and A.S. Nemirovski. Large deviations of vector-valued martingales in 2-smooth normed spaces. arXiv preprint, 2008.
A. Juditsky, G. Lan, A. Nemirovski, and A. Shapiro. Stochastic approximation approach to stochastic programming. SIAM Journal on Optimization, 19(4), 2009.
A. Nemirovskii and D. Yudin. Problem Complexity and Method Efficiency in Optimization. Nauka, 1979 (published in English by John Wiley, Chichester, 1983).

A.S. Nemirovskii and Yu.E. Nesterov. Optimal methods of smooth convex minimization. USSR Computational Mathematics and Mathematical Physics, 25(2):21–30, 1985.
Y. Nesterov. A method of solving a convex programming problem with convergence rate $O(1/k^2)$. Soviet Mathematics Doklady, 27(2):372–376, 1983.
Y. Nesterov. Introductory Lectures on Convex Optimization. Springer, 2003.
Y. Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming, 103(1):127–152, 2005.
Y. Nesterov and A. Nemirovskii. Interior-Point Polynomial Algorithms in Convex Programming. Society for Industrial and Applied Mathematics, Philadelphia, 1994.
Yu. Nesterov. Universal gradient methods for convex optimization problems. Mathematical Programming, 152(1-2):381–404, 2015.
Gilles Pisier. Martingales with values in uniformly convex spaces. Israel Journal of Mathematics, 20(3-4):326–350, 1975.
Nati Srebro, Karthik Sridharan, and Ambuj Tewari. On the universality of online mirror descent. In Advances in Neural Information Processing Systems 24, 2011.

CNRS & D.I., UMR 8548, ÉCOLE NORMALE SUPÉRIEURE, PARIS, FRANCE.
E-mail address: aspremon@ens.fr

FACULTAD DE MATEMÁTICAS & ESCUELA DE INGENIERÍA, PONTIFICIA UNIVERSIDAD CATÓLICA DE CHILE.
E-mail address: crguzman@uc.cl

ÉCOLE POLYTECHNIQUE FÉDÉRALE DE LAUSANNE, SWITZERLAND.
E-mail address: m.jaggi@gmail.com


More information

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO)

Combining Logistic Regression with Kriging for Mapping the Risk of Occurrence of Unexploded Ordnance (UXO) Combining Logistic Regression with Kriging for Maing the Risk of Occurrence of Unexloded Ordnance (UXO) H. Saito (), P. Goovaerts (), S. A. McKenna (2) Environmental and Water Resources Engineering, Deartment

More information

Feedback-error control

Feedback-error control Chater 4 Feedback-error control 4.1 Introduction This chater exlains the feedback-error (FBE) control scheme originally described by Kawato [, 87, 8]. FBE is a widely used neural network based controller

More information

Understanding DPMFoam/MPPICFoam

Understanding DPMFoam/MPPICFoam Understanding DPMFoam/MPPICFoam Jeroen Hofman March 18, 2015 In this document I intend to clarify the flow solver and at a later stage, the article-fluid and article-article interaction forces as imlemented

More information

On split sample and randomized confidence intervals for binomial proportions

On split sample and randomized confidence intervals for binomial proportions On slit samle and randomized confidence intervals for binomial roortions Måns Thulin Deartment of Mathematics, Usala University arxiv:1402.6536v1 [stat.me] 26 Feb 2014 Abstract Slit samle methods have

More information

BEST CONSTANT IN POINCARÉ INEQUALITIES WITH TRACES: A FREE DISCONTINUITY APPROACH

BEST CONSTANT IN POINCARÉ INEQUALITIES WITH TRACES: A FREE DISCONTINUITY APPROACH BEST CONSTANT IN POINCARÉ INEQUALITIES WITH TRACES: A FREE DISCONTINUITY APPROACH DORIN BUCUR, ALESSANDRO GIACOMINI, AND PAOLA TREBESCHI Abstract For Ω R N oen bounded and with a Lischitz boundary, and

More information

1 Extremum Estimators

1 Extremum Estimators FINC 9311-21 Financial Econometrics Handout Jialin Yu 1 Extremum Estimators Let θ 0 be a vector of k 1 unknown arameters. Extremum estimators: estimators obtained by maximizing or minimizing some objective

More information

#A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS

#A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS #A64 INTEGERS 18 (2018) APPLYING MODULAR ARITHMETIC TO DIOPHANTINE EQUATIONS Ramy F. Taki ElDin Physics and Engineering Mathematics Deartment, Faculty of Engineering, Ain Shams University, Cairo, Egyt

More information

p-adic Measures and Bernoulli Numbers

p-adic Measures and Bernoulli Numbers -Adic Measures and Bernoulli Numbers Adam Bowers Introduction The constants B k in the Taylor series exansion t e t = t k B k k! k=0 are known as the Bernoulli numbers. The first few are,, 6, 0, 30, 0,

More information

Robust Solutions to Markov Decision Problems

Robust Solutions to Markov Decision Problems Robust Solutions to Markov Decision Problems Arnab Nilim and Laurent El Ghaoui Deartment of Electrical Engineering and Comuter Sciences University of California, Berkeley, CA 94720 nilim@eecs.berkeley.edu,

More information

Recursive Estimation of the Preisach Density function for a Smart Actuator

Recursive Estimation of the Preisach Density function for a Smart Actuator Recursive Estimation of the Preisach Density function for a Smart Actuator Ram V. Iyer Deartment of Mathematics and Statistics, Texas Tech University, Lubbock, TX 7949-142. ABSTRACT The Preisach oerator

More information

Dependence on Initial Conditions of Attainable Sets of Control Systems with p-integrable Controls

Dependence on Initial Conditions of Attainable Sets of Control Systems with p-integrable Controls Nonlinear Analysis: Modelling and Control, 2007, Vol. 12, No. 3, 293 306 Deendence on Initial Conditions o Attainable Sets o Control Systems with -Integrable Controls E. Akyar Anadolu University, Deartment

More information

On a class of Rellich inequalities

On a class of Rellich inequalities On a class of Rellich inequalities G. Barbatis A. Tertikas Dedicated to Professor E.B. Davies on the occasion of his 60th birthday Abstract We rove Rellich and imroved Rellich inequalities that involve

More information

Multiplicity of weak solutions for a class of nonuniformly elliptic equations of p-laplacian type

Multiplicity of weak solutions for a class of nonuniformly elliptic equations of p-laplacian type Nonlinear Analysis 7 29 536 546 www.elsevier.com/locate/na Multilicity of weak solutions for a class of nonuniformly ellitic equations of -Lalacian tye Hoang Quoc Toan, Quô c-anh Ngô Deartment of Mathematics,

More information

THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT

THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT THE 2D CASE OF THE BOURGAIN-DEMETER-GUTH ARGUMENT ZANE LI Let e(z) := e 2πiz and for g : [0, ] C and J [0, ], define the extension oerator E J g(x) := g(t)e(tx + t 2 x 2 ) dt. J For a ositive weight ν

More information

Applications to stochastic PDE

Applications to stochastic PDE 15 Alications to stochastic PE In this final lecture we resent some alications of the theory develoed in this course to stochastic artial differential equations. We concentrate on two secific examles:

More information

SCHUR S LEMMA AND BEST CONSTANTS IN WEIGHTED NORM INEQUALITIES. Gord Sinnamon The University of Western Ontario. December 27, 2003

SCHUR S LEMMA AND BEST CONSTANTS IN WEIGHTED NORM INEQUALITIES. Gord Sinnamon The University of Western Ontario. December 27, 2003 SCHUR S LEMMA AND BEST CONSTANTS IN WEIGHTED NORM INEQUALITIES Gord Sinnamon The University of Western Ontario December 27, 23 Abstract. Strong forms of Schur s Lemma and its converse are roved for mas

More information

Stochastic integration II: the Itô integral

Stochastic integration II: the Itô integral 13 Stochastic integration II: the Itô integral We have seen in Lecture 6 how to integrate functions Φ : (, ) L (H, E) with resect to an H-cylindrical Brownian motion W H. In this lecture we address the

More information

Boundary problems for fractional Laplacians and other mu-transmission operators

Boundary problems for fractional Laplacians and other mu-transmission operators Boundary roblems for fractional Lalacians and other mu-transmission oerators Gerd Grubb Coenhagen University Geometry and Analysis Seminar June 20, 2014 Introduction Consider P a equal to ( ) a or to A

More information

ON PRINCIPAL FREQUENCIES AND INRADIUS IN CONVEX SETS

ON PRINCIPAL FREQUENCIES AND INRADIUS IN CONVEX SETS ON PRINCIPAL FREQUENCIES AND INRADIUS IN CONVEX SES LORENZO BRASCO o Michelino Brasco, master craftsman and father, on the occasion of his 7th birthday Abstract We generalize to the case of the Lalacian

More information

One Mirror Descent Algorithm for Convex Constrained Optimization Problems with Non-Standard Growth Properties

One Mirror Descent Algorithm for Convex Constrained Optimization Problems with Non-Standard Growth Properties One Mirror Descent Algorithm for Convex Constrained Optimization Problems with Non-Standard Growth Properties Fedor S. Stonyakin 1 and Alexander A. Titov 1 V. I. Vernadsky Crimean Federal University, Simferopol,

More information

Oracle Complexity of Second-Order Methods for Smooth Convex Optimization

Oracle Complexity of Second-Order Methods for Smooth Convex Optimization racle Complexity of Second-rder Methods for Smooth Convex ptimization Yossi Arjevani had Shamir Ron Shiff Weizmann Institute of Science Rehovot 7610001 Israel Abstract yossi.arjevani@weizmann.ac.il ohad.shamir@weizmann.ac.il

More information

Radial Basis Function Networks: Algorithms

Radial Basis Function Networks: Algorithms Radial Basis Function Networks: Algorithms Introduction to Neural Networks : Lecture 13 John A. Bullinaria, 2004 1. The RBF Maing 2. The RBF Network Architecture 3. Comutational Power of RBF Networks 4.

More information

Solving Support Vector Machines in Reproducing Kernel Banach Spaces with Positive Definite Functions

Solving Support Vector Machines in Reproducing Kernel Banach Spaces with Positive Definite Functions Solving Suort Vector Machines in Reroducing Kernel Banach Saces with Positive Definite Functions Gregory E. Fasshauer a, Fred J. Hickernell a, Qi Ye b, a Deartment of Alied Mathematics, Illinois Institute

More information

A construction of bent functions from plateaued functions

A construction of bent functions from plateaued functions A construction of bent functions from lateaued functions Ayça Çeşmelioğlu, Wilfried Meidl Sabancı University, MDBF, Orhanlı, 34956 Tuzla, İstanbul, Turkey. Abstract In this resentation, a technique for

More information

Additive results for the generalized Drazin inverse in a Banach algebra

Additive results for the generalized Drazin inverse in a Banach algebra Additive results for the generalized Drazin inverse in a Banach algebra Dragana S. Cvetković-Ilić Dragan S. Djordjević and Yimin Wei* Abstract In this aer we investigate additive roerties of the generalized

More information

On the minimax inequality and its application to existence of three solutions for elliptic equations with Dirichlet boundary condition

On the minimax inequality and its application to existence of three solutions for elliptic equations with Dirichlet boundary condition ISSN 1 746-7233 England UK World Journal of Modelling and Simulation Vol. 3 (2007) No. 2. 83-89 On the minimax inequality and its alication to existence of three solutions for ellitic equations with Dirichlet

More information

Quantitative estimates of propagation of chaos for stochastic systems with W 1, kernels

Quantitative estimates of propagation of chaos for stochastic systems with W 1, kernels oname manuscrit o. will be inserted by the editor) Quantitative estimates of roagation of chaos for stochastic systems with W, kernels Pierre-Emmanuel Jabin Zhenfu Wang Received: date / Acceted: date Abstract

More information

Multi-Operation Multi-Machine Scheduling

Multi-Operation Multi-Machine Scheduling Multi-Oeration Multi-Machine Scheduling Weizhen Mao he College of William and Mary, Williamsburg VA 3185, USA Abstract. In the multi-oeration scheduling that arises in industrial engineering, each job

More information

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)]

LECTURE 7 NOTES. x n. d x if. E [g(x n )] E [g(x)] LECTURE 7 NOTES 1. Convergence of random variables. Before delving into the large samle roerties of the MLE, we review some concets from large samle theory. 1. Convergence in robability: x n x if, for

More information

Preconditioning techniques for Newton s method for the incompressible Navier Stokes equations

Preconditioning techniques for Newton s method for the incompressible Navier Stokes equations Preconditioning techniques for Newton s method for the incomressible Navier Stokes equations H. C. ELMAN 1, D. LOGHIN 2 and A. J. WATHEN 3 1 Deartment of Comuter Science, University of Maryland, College

More information

t 0 Xt sup X t p c p inf t 0

t 0 Xt sup X t p c p inf t 0 SHARP MAXIMAL L -ESTIMATES FOR MARTINGALES RODRIGO BAÑUELOS AND ADAM OSȨKOWSKI ABSTRACT. Let X be a suermartingale starting from 0 which has only nonnegative jums. For each 0 < < we determine the best

More information

HAUSDORFF MEASURE OF p-cantor SETS

HAUSDORFF MEASURE OF p-cantor SETS Real Analysis Exchange Vol. 302), 2004/2005,. 20 C. Cabrelli, U. Molter, Deartamento de Matemática, Facultad de Cs. Exactas y Naturales, Universidad de Buenos Aires and CONICET, Pabellón I - Ciudad Universitaria,

More information

A Note on Guaranteed Sparse Recovery via l 1 -Minimization

A Note on Guaranteed Sparse Recovery via l 1 -Minimization A Note on Guaranteed Sarse Recovery via l -Minimization Simon Foucart, Université Pierre et Marie Curie Abstract It is roved that every s-sarse vector x C N can be recovered from the measurement vector

More information

On Acceleration with Noise-Corrupted Gradients. + m k 1 (x). By the definition of Bregman divergence:

On Acceleration with Noise-Corrupted Gradients. + m k 1 (x). By the definition of Bregman divergence: A Omitted Proofs from Section 3 Proof of Lemma 3 Let m x) = a i On Acceleration with Noise-Corrupted Gradients fxi ), u x i D ψ u, x 0 ) denote the function under the minimum in the lower bound By Proposition

More information

MATH 6210: SOLUTIONS TO PROBLEM SET #3

MATH 6210: SOLUTIONS TO PROBLEM SET #3 MATH 6210: SOLUTIONS TO PROBLEM SET #3 Rudin, Chater 4, Problem #3. The sace L (T) is searable since the trigonometric olynomials with comlex coefficients whose real and imaginary arts are rational form

More information

Principal Components Analysis and Unsupervised Hebbian Learning

Principal Components Analysis and Unsupervised Hebbian Learning Princial Comonents Analysis and Unsuervised Hebbian Learning Robert Jacobs Deartment of Brain & Cognitive Sciences University of Rochester Rochester, NY 1467, USA August 8, 008 Reference: Much of the material

More information

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables

Improved Bounds on Bell Numbers and on Moments of Sums of Random Variables Imroved Bounds on Bell Numbers and on Moments of Sums of Random Variables Daniel Berend Tamir Tassa Abstract We rovide bounds for moments of sums of sequences of indeendent random variables. Concentrating

More information

Bent Functions of maximal degree

Bent Functions of maximal degree IEEE TRANSACTIONS ON INFORMATION THEORY 1 Bent Functions of maximal degree Ayça Çeşmelioğlu and Wilfried Meidl Abstract In this article a technique for constructing -ary bent functions from lateaued functions

More information

REGULARITY OF SOLUTIONS TO DOUBLY NONLINEAR DIFFUSION EQUATIONS

REGULARITY OF SOLUTIONS TO DOUBLY NONLINEAR DIFFUSION EQUATIONS Seventh Mississii State - UAB Conference on Differential Equations and Comutational Simulations, Electronic Journal of Differential Equations, Conf. 17 2009,. 185 195. ISSN: 1072-6691. URL: htt://ejde.math.txstate.edu

More information

Universal Finite Memory Coding of Binary Sequences

Universal Finite Memory Coding of Binary Sequences Deartment of Electrical Engineering Systems Universal Finite Memory Coding of Binary Sequences Thesis submitted towards the degree of Master of Science in Electrical and Electronic Engineering in Tel-Aviv

More information

2 K. ENTACHER 2 Generalized Haar function systems In the following we x an arbitrary integer base b 2. For the notations and denitions of generalized

2 K. ENTACHER 2 Generalized Haar function systems In the following we x an arbitrary integer base b 2. For the notations and denitions of generalized BIT 38 :2 (998), 283{292. QUASI-MONTE CARLO METHODS FOR NUMERICAL INTEGRATION OF MULTIVARIATE HAAR SERIES II KARL ENTACHER y Deartment of Mathematics, University of Salzburg, Hellbrunnerstr. 34 A-52 Salzburg,

More information

Difference of Convex Functions Programming for Reinforcement Learning (Supplementary File)

Difference of Convex Functions Programming for Reinforcement Learning (Supplementary File) Difference of Convex Functions Programming for Reinforcement Learning Sulementary File Bilal Piot 1,, Matthieu Geist 1, Olivier Pietquin,3 1 MaLIS research grou SUPELEC - UMI 958 GeorgiaTech-CRS, France

More information

On Doob s Maximal Inequality for Brownian Motion

On Doob s Maximal Inequality for Brownian Motion Stochastic Process. Al. Vol. 69, No., 997, (-5) Research Reort No. 337, 995, Det. Theoret. Statist. Aarhus On Doob s Maximal Inequality for Brownian Motion S. E. GRAVERSEN and G. PESKIR If B = (B t ) t

More information

A New Perspective on Learning Linear Separators with Large L q L p Margins

A New Perspective on Learning Linear Separators with Large L q L p Margins A New Persective on Learning Linear Searators with Large L q L Margins Maria-Florina Balcan Georgia Institute of Technology Christoher Berlind Georgia Institute of Technology Abstract We give theoretical

More information

Topic: Lower Bounds on Randomized Algorithms Date: September 22, 2004 Scribe: Srinath Sridhar

Topic: Lower Bounds on Randomized Algorithms Date: September 22, 2004 Scribe: Srinath Sridhar 15-859(M): Randomized Algorithms Lecturer: Anuam Guta Toic: Lower Bounds on Randomized Algorithms Date: Setember 22, 2004 Scribe: Srinath Sridhar 4.1 Introduction In this lecture, we will first consider

More information

Distributed Rule-Based Inference in the Presence of Redundant Information

Distributed Rule-Based Inference in the Presence of Redundant Information istribution Statement : roved for ublic release; distribution is unlimited. istributed Rule-ased Inference in the Presence of Redundant Information June 8, 004 William J. Farrell III Lockheed Martin dvanced

More information

Asymptotic Properties of the Markov Chain Model method of finding Markov chains Generators of..

Asymptotic Properties of the Markov Chain Model method of finding Markov chains Generators of.. IOSR Journal of Mathematics (IOSR-JM) e-issn: 78-578, -ISSN: 319-765X. Volume 1, Issue 4 Ver. III (Jul. - Aug.016), PP 53-60 www.iosrournals.org Asymtotic Proerties of the Markov Chain Model method of

More information

Positive Definite Uncertain Homogeneous Matrix Polynomials: Analysis and Application

Positive Definite Uncertain Homogeneous Matrix Polynomials: Analysis and Application BULGARIA ACADEMY OF SCIECES CYBEREICS AD IFORMAIO ECHOLOGIES Volume 9 o 3 Sofia 009 Positive Definite Uncertain Homogeneous Matrix Polynomials: Analysis and Alication Svetoslav Savov Institute of Information

More information

An Algorithm for Total Variation Minimization and Applications

An Algorithm for Total Variation Minimization and Applications Journal of Mathematical Imaging and Vision 0: 89 97, 004 c 004 Kluwer Academic Publishers. Manufactured in The Netherlands. An Algorithm for Total Variation Minimization and Alications ANTONIN CHAMBOLLE

More information

arxiv: v1 [physics.data-an] 26 Oct 2012

arxiv: v1 [physics.data-an] 26 Oct 2012 Constraints on Yield Parameters in Extended Maximum Likelihood Fits Till Moritz Karbach a, Maximilian Schlu b a TU Dortmund, Germany, moritz.karbach@cern.ch b TU Dortmund, Germany, maximilian.schlu@cern.ch

More information

Extension of Minimax to Infinite Matrices

Extension of Minimax to Infinite Matrices Extension of Minimax to Infinite Matrices Chris Calabro June 21, 2004 Abstract Von Neumann s minimax theorem is tyically alied to a finite ayoff matrix A R m n. Here we show that (i) if m, n are both inite,

More information

On the Chvatál-Complexity of Knapsack Problems

On the Chvatál-Complexity of Knapsack Problems R u t c o r Research R e o r t On the Chvatál-Comlexity of Knasack Problems Gergely Kovács a Béla Vizvári b RRR 5-08, October 008 RUTCOR Rutgers Center for Oerations Research Rutgers University 640 Bartholomew

More information

LEIBNIZ SEMINORMS IN PROBABILITY SPACES

LEIBNIZ SEMINORMS IN PROBABILITY SPACES LEIBNIZ SEMINORMS IN PROBABILITY SPACES ÁDÁM BESENYEI AND ZOLTÁN LÉKA Abstract. In this aer we study the (strong) Leibniz roerty of centered moments of bounded random variables. We shall answer a question

More information