Improved Fast Dual Gradient Methods for Embedded Model Predictive Control


Preprints of the 19th World Congress of the International Federation of Automatic Control

Pontus Giselsson, Electrical Engineering, Stanford University

Abstract: Recently, several authors have suggested the use of first order methods, such as fast dual ascent and the alternating direction method of multipliers, for embedded model predictive control. The main reason is that they can be implemented using simple arithmetic operations only. In this paper, we present results that enable significant improvements when using fast dual ascent for embedded model predictive control. These improvements rely on a new characterization of the set of matrices that can be used to describe a quadratic upper bound to the negative dual function. For many interesting formulations, it is shown that the provided bound cannot be improved and that it is tighter than bounds previously presented in the literature. The improved quadratic upper bound is used in fast dual gradient methods as an approximation of the dual function. Since it approximates the dual function better than previous upper bounds, the convergence is improved. The performance enhancement is most significant for ill-conditioned problems. This is illustrated by a numerical evaluation on an AFTI-16 aircraft model, where the algorithm outperforms previous proposals of first order methods for embedded control by one to three orders of magnitude.

(During the preparation of this paper, the author was a member of the LCCC Linnaeus Center at Lund University. Financial support from the Swedish Research Council for the author's postdoctoral studies at Stanford University is gratefully acknowledged. Further, Eric Chu is gratefully acknowledged for constructive feedback, and Alexander Domahidi for suggesting the AFTI-16 control problem as a benchmark.)

1. INTRODUCTION

Several authors, including O'Donoghue et al. (2013); Jerez et al. (2013); Richter et al. (2013); Patrinos and Bemporad (2014), have recently proposed first order optimization methods as appropriate for embedded model predictive control. In O'Donoghue et al. (2013); Jerez et al. (2013), the alternating direction method of multipliers (ADMM, see Boyd et al. (2011)) was used and high computational speeds were reported when implemented on embedded hardware. In Richter et al. (2013); Patrinos and Bemporad (2014), the optimal control problems arising in model predictive control were solved using different formulations of fast dual gradient methods. In Richter et al. (2013), the equality constraints, i.e. the dynamic constraints, are dualized, and a diagonal cost and box constraints are assumed; the resulting dual problem is solved using a fast gradient method. In Patrinos and Bemporad (2014), the same splitting as in O'Donoghue et al. (2013); Jerez et al. (2013) is used, but a fast gradient method is used to solve the resulting problem instead of ADMM. In this paper, we show how to generalize the fast dual gradient methods presented in Richter et al. (2013); Patrinos and Bemporad (2014) to achieve faster convergence.

Fast gradient methods as used in Richter et al. (2013); Patrinos and Bemporad (2014) have been around since the early 1980s, when the seminal paper Nesterov (1983) was published. However, fast gradient methods did not receive much attention before the mid 2000s, after which interest has increased. Several extensions and generalizations of the fast gradient method have been proposed, e.g. in Nesterov (2003, 2005).
In Beck and Teboulle (2009), the fast gradient method was generalized to allow for minimization of composite objective functions. Further, a unified framework for fast gradient methods and their generalizations was presented in Tseng (2008). To use fast gradient methods for composite minimization, one objective term should be convex and differentiable with a Lipschitz continuous gradient, while the other should be proper, closed, and convex. The former condition is equivalent to the existence of a quadratic upper bound to the function with the same curvature in all directions; the curvature is specified by the Lipschitz constant of the gradient. In fast gradient methods, the quadratic upper bound serves as an approximation of the function to be minimized, since the bound is minimized in every iteration of the algorithm. If the quadratic upper bound approximates the function to be minimized poorly, slow convergence is expected. By instead allowing a quadratic upper bound with different curvature in different directions, as in generalized fast gradient methods (Zuo and Lin, 2011), the bound can approximate the function to be minimized more closely. For an appropriate choice of non-uniform quadratic upper bound, this can significantly improve the performance of the algorithm.

In (Nesterov, 2005, Theorem 1), a Lipschitz constant of the gradient of the dual function of strongly convex problems is presented. This result quantifies the curvature of a uniform quadratic upper bound to the negative dual function. The result was improved in (Richter et al., 2013, Theorem 7) for the case where the primal cost is restricted to being quadratic. Using these quadratic upper bounds, with the same curvature in all directions, as dual function approximations in a fast dual gradient method may result in slow convergence, especially for ill-conditioned problems where the upper bound approximates the negative dual function poorly.

In this paper, the main result is a new characterization of the set of matrices that can be used to describe quadratic upper bounds to the negative dual function. This result generalizes and improves the previous results in Nesterov (2005); Richter et al. (2013) by allowing a bound with different curvature in different directions. Using such a quadratic upper bound instead of the traditional uniform upper bound in fast dual gradient methods, significant improvements can be achieved. We also show that in many cases, the resulting quadratic upper bounds cannot be made tighter by using matrices outside this set.

In model predictive control, much offline computational effort can be devoted to improving the online execution time of the solver. This is done, e.g., in explicit MPC, see Bemporad et al. (2002), where the explicit parametric solution is computed beforehand and found through a lookup table online. In this paper, the offline computational effort is devoted to choosing a matrix from the provided set that minimizes the difference, in some metric, between the negative dual function and the resulting quadratic upper bound. The computed matrix is the same in all samples in the controller and can therefore be computed offline.

The algorithm is evaluated on a pitch control problem for an AFTI-16 aircraft that has previously been studied in Kapasouris et al. (1990); Bemporad et al. (1997). This is a challenging problem for first order methods since it is very ill-conditioned. The numerical evaluation shows that the method presented in this paper outperforms the methods presented in O'Donoghue et al. (2013); Jerez et al. (2013); Richter et al. (2013); Patrinos and Bemporad (2014) by one to three orders of magnitude. For space considerations, all proofs are omitted from this paper; they can be found in the full version of the paper, Giselsson (2014).

2. PRELIMINARIES AND NOTATION

2.1 Notation

We denote by R, R^n, and R^{m x n} the sets of real numbers, vectors, and matrices. S^n \subseteq R^{n x n} is the set of symmetric matrices, and S^n_{++} \subseteq S^n and S^n_{+} \subseteq S^n are the sets of positive definite and positive semidefinite matrices respectively. Further, L \succ M and L \succeq M, where L, M \in S^n, denote L - M \in S^n_{++} and L - M \in S^n_{+} respectively. We also use the notation <x, y> = x^T y, \|x\| = \sqrt{x^T x}, and \|x\|_H = \sqrt{x^T H x}. Finally, I_X denotes the indicator function of the set X, i.e. I_X(x) = 0 if x \in X and I_X(x) = \infty otherwise.

2.2 Preliminaries

In this section, we introduce generalizations of commonly used concepts. We generalize the notion of strong convexity as well as the notion of Lipschitz continuity of the gradient of convex functions. We also define conjugate functions and state a known result on dual properties of a function and its conjugate.

For differentiable and convex functions f : R^n -> R that have a Lipschitz continuous gradient with constant L, we have that

    \|\nabla f(x_1) - \nabla f(x_2)\| <= L \|x_1 - x_2\|                         (1)

holds for all x_1, x_2 \in R^n. This is equivalent to requiring that

    f(x_1) <= f(x_2) + <\nabla f(x_2), x_1 - x_2> + (L/2) \|x_1 - x_2\|^2        (2)

holds for all x_1, x_2 \in R^n (Nesterov, 2003, Theorem 2.1.5). In this paper, we allow for a generalized version of the quadratic upper bound (2) to f, namely that

    f(x_1) <= f(x_2) + <\nabla f(x_2), x_1 - x_2> + (1/2) \|x_1 - x_2\|_L^2      (3)

holds for all x_1, x_2 \in R^n, where L \in S^n_{++}. The bound (2) is obtained by setting L = L I in (3).

Remark 1. For concave functions f, i.e. functions f such that -f is convex, the Lipschitz condition (1) is equivalent to the quadratic lower bound

    f(x_1) >= f(x_2) + <\nabla f(x_2), x_1 - x_2> - (L/2) \|x_1 - x_2\|^2        (4)

holding for all x_1, x_2 \in R^n. The generalized counterpart naturally becomes that

    f(x_1) >= f(x_2) + <\nabla f(x_2), x_1 - x_2> - (1/2) \|x_1 - x_2\|_L^2      (5)

holds for all x_1, x_2 \in R^n.
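As a concrete illustration of why the generalized bound (3) can be preferable to the uniform bound (2), consider a convex quadratic f(x) = (1/2) x^T H x. Its gradient is Lipschitz with constant lambda_max(H), and the bound (3) holds with equality for L = H. The following sketch is a minimal numerical check of this (it is not from the paper; the test matrix and random points are chosen for illustration only).

```python
import numpy as np

# Illustration only: for f(x) = 0.5 x'Hx, compare the uniform quadratic upper
# bound (2) with L = lambda_max(H) to the generalized bound (3) with L = H.
rng = np.random.default_rng(0)
H = np.diag([1.0, 10.0, 100.0])            # ill-conditioned quadratic cost (assumed example)
f = lambda x: 0.5 * x @ H @ x
grad_f = lambda x: H @ x

L_scalar = np.max(np.linalg.eigvalsh(H))   # Lipschitz constant of grad f

x1, x2 = rng.standard_normal(3), rng.standard_normal(3)
d = x1 - x2
linear_part = f(x2) + grad_f(x2) @ d

uniform_bound = linear_part + 0.5 * L_scalar * (d @ d)   # bound (2)
matrix_bound = linear_part + 0.5 * (d @ H @ d)           # bound (3) with L = H

print(matrix_bound <= uniform_bound)       # True: the generalized bound (3) is tighter
print(np.isclose(f(x1), matrix_bound))     # True: (3) with L = H is exact for quadratics
```

The more ill-conditioned H is, the larger the gap between the two bounds becomes, which is exactly the situation exploited later for the dual function.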
Next, we state a lemma on equivalent characterizations of the condition (3).

Lemma 2. Assume that f : R^n -> R is convex and differentiable. The condition that

    f(x_1) <= f(x_2) + <\nabla f(x_2), x_1 - x_2> + (1/2) \|x_1 - x_2\|_L^2      (6)

holds for some L \in S^n_{++} and all x_1, x_2 \in R^n is equivalent to the condition that

    <\nabla f(x_1) - \nabla f(x_2), x_1 - x_2> <= \|x_1 - x_2\|_L^2              (7)

holds for all x_1, x_2 \in R^n.

The standard definition of a differentiable and strongly convex function f : R^n -> R is that it satisfies

    f(x_1) >= f(x_2) + <\nabla f(x_2), x_1 - x_2> + (sigma/2) \|x_1 - x_2\|^2    (8)

for all x_1, x_2 \in R^n, where the modulus sigma > 0 describes a lower bound on the curvature of the function. In this paper, the definition (8) is generalized to allow for a quadratic lower bound with different curvature in different directions.

Definition 3. A differentiable function f : R^n -> R is strongly convex with matrix H if and only if

    f(x_1) >= f(x_2) + <\nabla f(x_2), x_1 - x_2> + (1/2) \|x_1 - x_2\|_H^2

holds for all x_1, x_2 \in R^n, where H \in S^n_{++}.

Remark 4. The traditional definition of strong convexity (8) is obtained from Definition 3 by setting H = sigma I.

Lemma 5. Assume that f : R^n -> R is differentiable and strongly convex with matrix H. The condition that

    f(x_1) >= f(x_2) + <\nabla f(x_2), x_1 - x_2> + (1/2) \|x_1 - x_2\|_H^2      (9)

holds for all x_1, x_2 \in R^n is equivalent to the condition that

    <\nabla f(x_1) - \nabla f(x_2), x_1 - x_2> >= \|x_1 - x_2\|_H^2              (10)

holds for all x_1, x_2 \in R^n.

The condition (9) is a quadratic lower bound on the function value, while the condition (3) is a quadratic upper bound on the function value. These two properties are linked through the conjugate function

    f^*(y) := sup_x { <y, x> - f(x) }.

More precisely, we have the following result.

Proposition 6. Assume that f : R^n -> R \cup {\infty} is closed, proper, and strongly convex with modulus sigma on the relative interior of its domain. Then the conjugate function f^* is convex and differentiable, and \nabla f^*(y) = x^*(y), where x^*(y) = argmax_x { <y, x> - f(x) }. Further, \nabla f^* is Lipschitz continuous with constant L = 1/sigma.

A straightforward generalization is given by the chain rule and was proven in (Nesterov, 2005, Theorem 1), which also proves the less general Proposition 6.

Corollary 7. Assume that f : R^n -> R \cup {\infty} is closed, proper, and strongly convex with modulus sigma on the relative interior of its domain. Further, define g^*(y) := f^*(A^T y). Then g^* is convex and differentiable, and \nabla g^*(y) = A x^*(A^T y), where x^*(A^T y) = argmax_x { <A^T y, x> - f(x) }. Further, \nabla g^* is Lipschitz continuous with constant L = \|A\|^2 / sigma.

For the case when f(x) = (1/2) x^T H x + g^T x, i.e. when f is quadratic, a tighter Lipschitz constant to \nabla g^*(y) = \nabla f^*(A^T y) was provided in (Richter et al., 2013, Theorem 7), namely L = \|A H^{-1} A^T\|.

3. PROBLEM FORMULATION

We consider optimization problems of the form

    minimize    f(x) + h(x) + g(Bx)
    subject to  Ax = b                                                           (11)

where x \in R^n, A \in R^{m x n}, B \in R^{p x n}, and b \in R^m. We assume that the following assumption holds throughout the paper.

Assumption 8.
(a) The function f : R^n -> R is differentiable and strongly convex with matrix H.
(b) The extended valued functions h : R^n -> R \cup {\infty} and g : R^p -> R \cup {\infty} are proper, closed, and convex.
(c) A \in R^{m x n} has full row rank.

Remark 9. Examples of functions that satisfy Assumptions 8(a) and 8(b) are f(x) = (1/2) x^T H x + g^T x with H \in S^n_{++} for Assumption 8(a), and g = I_X, g = \|.\|_1, g = \|.\|_1 + I_X, or g = 0 for Assumption 8(b). If Assumption 8(c) is not satisfied, redundant equality constraints can be removed, without affecting the solution of (11), to satisfy the assumption.

The optimization problem (11) can equivalently be written as

    minimize    f(x) + h(x) + g(y)
    subject to  Ax = b,  Bx = y.                                                 (12)

We introduce dual variables lambda \in R^m for the equality constraints Ax = b and dual variables mu \in R^p for the equality constraints Bx = y. This gives the following Lagrange dual problem:

    sup_{lambda, mu} inf_{x, y} { f(x) + h(x) + lambda^T (Ax - b) + g(y) + mu^T (Bx - y) }
      = sup_{lambda, mu} { - sup_x [ (-A^T lambda - B^T mu)^T x - f(x) - h(x) ] - b^T lambda - sup_y [ mu^T y - g(y) ] }
      = sup_{lambda, mu} { - F^*(-A^T lambda - B^T mu) - b^T lambda - g^*(mu) }               (13)

where F^* is the conjugate function of F := f + h and g^* is the conjugate function of g. For ease of exposition, we introduce nu = (lambda, mu) \in R^{m+p}, C = [A^T B^T]^T \in R^{(m+p) x n}, c = (b, 0) \in R^{m+p}, and the function

    d(nu) := -F^*(-C^T nu) - c^T nu = -F^*(-A^T lambda - B^T mu) - b^T lambda.                (14)

The function F^* is evaluated by solving an optimization problem; the minimizer of this problem is denoted by

    x^*(nu) := argmin_x { F(x) + nu^T C x }                                                   (15)
             = argmin_x { f(x) + h(x) + lambda^T A x + mu^T B x }.

From Corollary 7 we get that the function d is concave and differentiable with gradient \nabla d(nu) = C x^*(nu) - c, and that \nabla d is Lipschitz continuous with constant L = \|C\|^2 / lambda_min(H), i.e., that

    \|\nabla d(nu_1) - \nabla d(nu_2)\| <= L \|nu_1 - nu_2\|                                  (16)

holds for all nu_1, nu_2 \in R^{m+p}. As stated in Remark 1, (16) is equivalent to the following quadratic lower bound on the concave function d holding for all nu_1, nu_2 \in R^{m+p}:

    d(nu_1) >= d(nu_2) + <\nabla d(nu_2), nu_1 - nu_2> - (L/2) \|nu_1 - nu_2\|^2.

In the following section we show that the function d satisfies the tighter condition

    d(nu_1) >= d(nu_2) + <\nabla d(nu_2), nu_1 - nu_2> - (1/2) \|nu_1 - nu_2\|_L^2            (17)

for all nu_1, nu_2 \in R^{m+p} and all L \in S^{m+p}_{+} that satisfy L \succeq C H^{-1} C^T.
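To make (14)-(17) concrete, the following sketch (an illustration under assumed problem sizes, not code from the paper) evaluates x^*(nu) and \nabla d(nu) for the simplest case f(x) = (1/2) x^T H x + q^T x with h = g = 0, forms the curvature matrix C H^{-1} C^T appearing in (17), and checks that the bound is tight in this case.

```python
import numpy as np

# Illustration (not from the paper): x*(nu), grad d(nu), and the curvature
# matrix C H^{-1} C^T for f(x) = 0.5 x'Hx + q'x with h = g = 0.
rng = np.random.default_rng(1)
n, m, p = 6, 3, 2                                   # assumed sizes
H = np.diag(rng.uniform(0.1, 10.0, n))              # strong convexity matrix
q = rng.standard_normal(n)
A, b = rng.standard_normal((m, n)), rng.standard_normal(m)
B = rng.standard_normal((p, n))
C, c = np.vstack([A, B]), np.concatenate([b, np.zeros(p)])

def x_star(nu):                                     # minimizer in (15), closed form
    return -np.linalg.solve(H, q + C.T @ nu)

def d(nu):                                          # dual function (14)
    x = x_star(nu)
    return 0.5 * x @ H @ x + q @ x + nu @ (C @ x - c)

def grad_d(nu):                                     # gradient of d
    return C @ x_star(nu) - c

L_mat = C @ np.linalg.solve(H, C.T)                 # curvature matrix C H^{-1} C^T

nu1, nu2 = rng.standard_normal(m + p), rng.standard_normal(m + p)
dv = nu1 - nu2
rhs = d(nu2) + grad_d(nu2) @ dv - 0.5 * dv @ L_mat @ dv
print(np.isclose(d(nu1), rhs))                      # True: (17) holds with equality here
```

In this quadratic case d is itself quadratic with Hessian -C H^{-1} C^T, so the bound (17) with L = C H^{-1} C^T is exact; Theorem 10 below states that the same L gives a valid bound in the general case of Assumption 8.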
4. DUAL FUNCTION PROPERTIES

The following theorem states that the function d defined in (14) satisfies (17) for any L \succeq C H^{-1} C^T. The results of this section are proven in Giselsson (2014).

Theorem 10. The function d defined in (14) is concave and differentiable and satisfies

    d(nu_1) >= d(nu_2) + <\nabla d(nu_2), nu_1 - nu_2> - (1/2) \|nu_1 - nu_2\|_L^2            (18)

for every nu_1, nu_2 \in R^{m+p} and every L \in S^{m+p}_{+} that satisfies L \succeq C H^{-1} C^T.

Next, we show that if f is a strongly convex quadratic function and h satisfies certain conditions, then Theorem 10 gives the best possible bound of the form (18).

Proposition 11. Assume that f(x) = (1/2) x^T H x + zeta^T x with H \in S^n_{++} and zeta \in R^n, and that there exists a set X \subseteq R^n with non-empty interior on which h is linear, i.e. h(x) = xi_X^T x + theta_X for all x \in X. Further, assume that there exists a dual point nu_bar such that x^*(nu_bar) \in int(X). Then for any matrix L that does not satisfy L \succeq C H^{-1} C^T, there exist nu_1 and nu_2 such that (18) does not hold.

Proposition 11 shows that the bound in Theorem 10 is indeed the best obtainable bound of the form (18) if f is quadratic and h satisfies the stated assumptions. Examples of functions that satisfy the assumptions on h in Proposition 11 include linear functions, indicator functions of closed convex constraint sets with non-empty interior, and the 1-norm.

5. FAST DUAL GRADIENT METHODS

In this section, we describe generalized fast gradient methods and show how they can be applied to solve the dual problem (13). Generalized fast gradient methods can be applied to solve problems of the form

    minimize  l(y) + psi(y)                                                                   (19)

where y \in R^n, psi : R^n -> R \cup {\infty} is proper, closed, and convex, and l : R^n -> R is convex, differentiable, and satisfies

    l(y_1) <= l(y_2) + <\nabla l(y_2), y_1 - y_2> + (1/2) \|y_1 - y_2\|_L^2                   (20)

for all y_1, y_2 \in R^n and some L \in S^n_{++}. Before we state the algorithm, we define the generalized prox operator

    prox_psi^L(y) := argmin_x { psi(x) + (1/2) \|x - y\|_L^2 }                                (21)

and note that

    prox_psi^L( y - L^{-1} \nabla l(y) )
      = argmin_x { (1/2) \|x - y + L^{-1} \nabla l(y)\|_L^2 + psi(x) }
      = argmin_x { l(y) + <\nabla l(y), x - y> + (1/2) \|x - y\|_L^2 + psi(x) }.              (22)

The generalized fast gradient method is stated below.

Algorithm 1. Generalized fast gradient method
Set: y^1 = x^0 \in R^n, t^1 = 1
For k >= 1:
    x^k     = prox_psi^L( y^k - L^{-1} \nabla l(y^k) )
    t^{k+1} = ( 1 + sqrt(1 + 4 (t^k)^2) ) / 2
    y^{k+1} = x^k + ((t^k - 1)/t^{k+1}) (x^k - x^{k-1})

The standard fast gradient method as presented in Beck and Teboulle (2009) is obtained by setting L = L I in Algorithm 1, where L is the Lipschitz constant of \nabla l. The main step of the fast gradient method is the prox-step, i.e., minimizing (22), which can be seen as minimizing an approximation of the function l + psi. For the standard fast gradient method, l is approximated by a quadratic upper bound that has the same curvature, described by L, in all directions. If this quadratic upper bound approximates the function to be minimized poorly, slow convergence is expected. The generalization to a matrix L allows for quadratic upper bounds with different curvature in different directions. This enables quadratic upper bounds that approximate the function l much more closely and consequently gives improved convergence. The generalized fast gradient method has the convergence rate (see Zuo and Lin (2011))

    (l + psi)(x^k) - (l + psi)(x^*) <= 2 \|x^* - x^0\|_L^2 / (k + 1)^2                        (23)

where x^* is a minimizer of l + psi. The convergence rate of the standard fast gradient method, as given in Beck and Teboulle (2009), is obtained by setting L = L I in (23).
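A compact implementation sketch of Algorithm 1 is given below (illustrative only; the gradient, the generalized prox operator, and the scaling matrix are supplied by the caller, and a fixed iteration count stands in for a proper termination test).

```python
import numpy as np

def generalized_fast_gradient(grad_l, prox_psi_L, L, y0, iters=100):
    """Algorithm 1: generalized fast gradient method for min l(y) + psi(y).

    grad_l     : callable returning the gradient of l
    prox_psi_L : callable implementing prox_psi^L from (21)
    L          : positive definite scaling matrix from the bound (20)
    """
    y = y0.copy()
    x_prev = y0.copy()
    t = 1.0
    for _ in range(iters):
        # Prox step: minimize the quadratic upper bound (22) of l plus psi.
        x = prox_psi_L(y - np.linalg.solve(L, grad_l(y)))
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t**2)) / 2.0
        # Nesterov-type extrapolation (momentum) step.
        y = x + ((t - 1.0) / t_next) * (x - x_prev)
        x_prev, t = x, t_next
    return x_prev
```

With L a multiple of the identity and prox_psi_L the standard prox operator, this reduces to the fast gradient method of Beck and Teboulle (2009); in practice, L would be factorized once offline instead of solved against in every iteration.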
The objective here is to apply the generalized fast gradient method to solve the dual problem (13). By introducing gbar^*(nu) := g^*([0 I] nu), the dual problem (13) can be expressed as max_nu { d(nu) - gbar^*(nu) }, where d is defined in (14). As shown in Theorem 10, the function d satisfies the properties required to apply generalized fast gradient methods, namely that -d satisfies (20) for any L \in S^{m+p}_{++} such that L \succeq C H^{-1} C^T. Further, since g is a closed, proper, and convex function, so is g^*, see (Rockafellar, 1970, Theorem 12.2), and by (Rockafellar, 1970, Theorem 5.7) so is gbar^*. This implies that generalized fast gradient methods, i.e. Algorithm 1, can be used to solve the dual problem (13). We set l = -d and psi = gbar^*, and restrict L = blkdiag(L_lambda, L_mu) to get the following algorithm.

Algorithm 2. Generalized fast dual gradient method
Set: z^1 = lambda^0 \in R^m, v^1 = mu^0 \in R^p, t^1 = 1
For k >= 1:
    x^k       = argmin_x { f(x) + h(x) + (z^k)^T A x + (v^k)^T B x }
    lambda^k  = z^k + L_lambda^{-1} (A x^k - b)
    mu^k      = prox_{g^*}^{L_mu}( v^k + L_mu^{-1} B x^k )
    t^{k+1}   = ( 1 + sqrt(1 + 4 (t^k)^2) ) / 2
    z^{k+1}   = lambda^k + ((t^k - 1)/t^{k+1}) (lambda^k - lambda^{k-1})
    v^{k+1}   = mu^k + ((t^k - 1)/t^{k+1}) (mu^k - mu^{k-1})

Here x^k is the primal variable at iteration k that is used to compute the gradient \nabla d(nu^k), where nu^k = (z^k, v^k). To arrive at the lambda^k and mu^k iterations, we let xi^k = (lambda^k, mu^k) and note that

    xi^k = prox_{gbar^*}^L( nu^k + L^{-1} \nabla d(nu^k) )                                    (24)
         = argmin_nu { (1/2) \|nu - nu^k - L^{-1} \nabla d(nu^k)\|_L^2 + g^*([0 I] nu) }
         = [ argmin_z { (1/2) \|z - z^k - L_lambda^{-1} \nabla_z d(nu^k)\|_{L_lambda}^2 } ;
             argmin_v { (1/2) \|v - v^k - L_mu^{-1} \nabla_v d(nu^k)\|_{L_mu}^2 + g^*(v) } ]
         = [ z^k + L_lambda^{-1} (A x^k - b) ;
             prox_{g^*}^{L_mu}( v^k + L_mu^{-1} B x^k ) ].

In the following proposition we state the convergence rate properties of Algorithm 2.

Proposition 12. Suppose that Assumption 8 holds and that L = blkdiag(L_lambda, L_mu) \in S^{m+p}_{++} is chosen such that L \succeq C H^{-1} C^T. Then Algorithm 2 converges with the rate

    D(nu^*) - D(nu^k) <= 2 \|nu^* - nu^0\|_L^2 / (k + 1)^2,   k >= 1                          (25)

where D := d - gbar^* and k is the iteration number.
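A direct transcription of Algorithm 2 is sketched below (illustrative only). The inner minimization over x and the prox of g^* are problem dependent and are therefore passed in as callables; Examples 17 and 18 below give concrete instances of both.

```python
import numpy as np

def fast_dual_gradient(x_update, prox_gstar_Lmu, A, b, B,
                       Llam_solve, Lmu_solve, lam0, mu0, iters=500):
    """Algorithm 2: generalized fast dual gradient method.

    x_update       : callable (z, v) -> argmin_x f(x) + h(x) + z'Ax + v'Bx
    prox_gstar_Lmu : callable implementing prox_{g*}^{L_mu}
    Llam_solve     : callable r -> L_lambda^{-1} r
    Lmu_solve      : callable r -> L_mu^{-1} r
    """
    z, v = lam0.copy(), mu0.copy()
    lam_prev, mu_prev = lam0.copy(), mu0.copy()
    t = 1.0
    for _ in range(iters):
        x = x_update(z, v)                               # defines grad d at (z, v)
        lam = z + Llam_solve(A @ x - b)                  # gradient step in lambda
        mu = prox_gstar_Lmu(v + Lmu_solve(B @ x))        # prox step in mu
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t**2)) / 2.0
        z = lam + ((t - 1.0) / t_next) * (lam - lam_prev)
        v = mu + ((t - 1.0) / t_next) * (mu - mu_prev)
        lam_prev, mu_prev, t = lam, mu, t_next
    return x, lam_prev, mu_prev
```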

Remark 13. By forming a specific running average of previous primal variables, it is possible to prove an O(1/k) convergence rate for the distance to the primal optimal point and an O(1/k^2) convergence rate for the worst case primal infeasibility, see Patrinos and Bemporad (2014).

For some choices of conjugate functions g^*, prox_{g^*}^{L_mu} can be difficult to evaluate. For standard prox operators (given by prox_{g^*}^{I}), the Moreau decomposition (Rockafellar, 1970, Theorem 31.5) states that prox_{g^*}^{I}(y) + prox_{g}^{I}(y) = y. In the following proposition, we generalize this result to the generalized prox operator used here.

Proposition 14. Assume that g : R^n -> R \cup {\infty} is a proper, closed, and convex function. Then

    prox_{g^*}^{L}(y) + L^{-1} prox_{g}^{L^{-1}}(L y) = y

for every y \in R^n and any L \in S^n_{++}.

Remark 15. If g = I_X, where I_X is the indicator function of X, then g^* is the support function of X. Evaluating the prox operator (21) with g^* being a support function is difficult. However, through Proposition 14, this can be rewritten to require only a projection onto the set X. If X is a box constraint and L is diagonal, the projection becomes a max-operation and is hence very cheap to implement.

Remark 16. We are not restricted to having one auxiliary term g only. We can have any number of auxiliary terms g_i that all decompose according to the computations in (24), i.e., we get one prox-operation in the algorithm for every auxiliary term g_i.

6. MODEL PREDICTIVE CONTROL

In this section, we pose some standard model predictive control problems and show how they can be solved using the methods presented in this paper. The resulting algorithms use simple arithmetic operations only, which allows for easier implementation in embedded systems. We also show how to choose the L-matrix in each case.

Example 17. We consider MPC optimization problems of the form

    minimize    sum_{t=0}^{N-1} ( (1/2) x_t^T Q x_t + (1/2) u_t^T R u_t ) + (1/2) x_N^T Q_f x_N
    subject to  x_{t+1} = Phi x_t + Gamma u_t,   t = 0, ..., N-1
                x_min <= x_t <= x_max,           t = 0, ..., N
                u_min <= u_t <= u_max,           t = 0, ..., N-1
                x_0 = xbar

where x_t \in R^{n_x}, u_t \in R^{n_u}, Phi \in R^{n_x x n_x}, Gamma \in R^{n_x x n_u}, and Q \in S^{n_x}_{++}, R \in S^{n_u}_{++}, Q_f \in S^{n_x}_{++} are all diagonal. Letting x = (x_0, ..., x_N, u_0, ..., u_{N-1}), this can be cast as

    minimize    (1/2) x^T H x
    subject to  Ax = b,  x_min <= x <= x_max

where H, A, b, x_min, and x_max are structured according to the problem above. We choose f(x) = (1/2) x^T H x, g = 0, and h = I_Y, where I_Y is the indicator function of Y = { x \in R^{(N+1) n_x + N n_u} : x_min <= x <= x_max }. This implies that we introduce dual variables lambda only for the equality constraints Ax = b. The algorithm becomes:

    x^k       = argmin_x { (1/2) x^T H x + I_Y(x) + (z^k)^T A x }                             (26)
    lambda^k  = z^k + L_lambda^{-1} (A x^k - b)                                               (27)
    t^{k+1}   = ( 1 + sqrt(1 + 4 (t^k)^2) ) / 2                                               (28)
    z^{k+1}   = lambda^k + ((t^k - 1)/t^{k+1}) (lambda^k - lambda^{k-1})                      (29)

where the first step (26) can be implemented as

    x^k = max( min( -H^{-1} A^T z^k, x_max ), x_min )                                         (30)

due to the structure of the problem. From Theorem 10, we know that L_lambda must satisfy L_lambda \succeq A H^{-1} A^T. By Assumption 8, A has full row rank, which is common in MPC. Further, A is sparse in MPC, which makes L_lambda = A H^{-1} A^T a good choice. The algorithm requires the computation of L_lambda^{-1} z, where z = A x^k - b, in each iteration. Since L_lambda = A H^{-1} A^T is sparse, this can be implemented efficiently by storing, offline, the sparse Cholesky factorization R^T R = S^T L_lambda S, where R is sparse and upper triangular and S is a permutation matrix. The online computation of L_lambda^{-1} z then reduces to one forward and one backward solve, which can be implemented very efficiently. The algorithm in this example is a generalization of the algorithm in Richter et al. (2013), where the matrix L is chosen as L = \|A H^{-1} A^T\| I. In the numerical section, we will see that this generalization can significantly improve the convergence rate.
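The iterations (26)-(29) involve only a componentwise clipping, sparse matrix-vector products, and two triangular solves. A sketch is given below (illustrative only; it assumes the stacked matrices H (diagonal), A, b, and the bounds have already been built, with A stored as a scipy sparse matrix, and it uses scipy's sparse LU factorization in place of the sparse Cholesky factorization described above; a fixed iteration count stands in for the termination test of Section 7).

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def fast_dual_gradient_box(Hdiag, A, b, x_min, x_max, iters=200):
    """Iterations (26)-(29) for Example 17 with L_lambda = A H^{-1} A^T.

    Hdiag : 1-D array with the diagonal of H (diagonal cost assumed)
    A, b  : sparse equality constraint data (dynamics and initial state)
    """
    Hinv = 1.0 / Hdiag
    L_lam = (A @ sp.diags(Hinv) @ A.T).tocsc()       # L_lambda = A H^{-1} A^T
    L_solve = spla.splu(L_lam).solve                 # offline factorization (LU as a Cholesky stand-in)

    m = A.shape[0]
    z = np.zeros(m)
    lam_prev = np.zeros(m)
    t = 1.0
    for _ in range(iters):
        # (26)/(30): clip the unconstrained minimizer (H diagonal, box constraints)
        x = np.clip(-Hinv * (A.T @ z), x_min, x_max)
        lam = z + L_solve(A @ x - b)                        # (27)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t**2)) / 2.0    # (28)
        z = lam + ((t - 1.0) / t_next) * (lam - lam_prev)   # (29)
        lam_prev, t = lam, t_next
    return x, lam_prev
```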
Next, we present an algorithm that works for arbitrary positive definite cost matrices and arbitrary linear inequality constraints.

Example 18. We consider MPC optimization problems of the form

    minimize    sum_{t=0}^{N-1} ( (1/2) x_t^T Q x_t + (1/2) u_t^T R u_t ) + (1/2) x_N^T Q_f x_N
    subject to  x_{t+1} = Phi x_t + Gamma u_t,   t = 0, ..., N-1
                B_x x_t <= d_x,                  t = 0, ..., N-1
                B_u u_t <= d_u,                  t = 0, ..., N-1
                x_0 = xbar,   B_N x_N <= d_N

where x_t \in R^{n_x}, u_t \in R^{n_u}, Phi \in R^{n_x x n_x}, Gamma \in R^{n_x x n_u}, B_x \in R^{p_x x n_x}, B_u \in R^{p_u x n_u}, B_N \in R^{p_N x n_x}, d_x \in R^{p_x}, d_u \in R^{p_u}, d_N \in R^{p_N}, Q \in S^{n_x}_{++}, R \in S^{n_u}_{++}, and Q_f \in S^{n_x}_{++}. We let x = (x_0, ..., x_N, u_0, ..., u_{N-1}) and define B = blkdiag(Bbar_x, B_N, Bbar_u), where Bbar_x = blkdiag(B_x, ..., B_x) and Bbar_u = blkdiag(B_u, ..., B_u). We also introduce d = (d_x, ..., d_x, d_N, d_u, ..., d_u). This implies that all inequality constraints are described by Bx <= d. Using this notation, the optimization problem can be rewritten as

    minimize    (1/2) x^T H x
    subject to  Ax = b,  Bx = y,  y <= d.

We let f(x) = (1/2) x^T H x, h = I_{Ax=b}, and g = I_Y with Y = { y : y <= d }. Since h is the indicator function of the equality constraints Ax = b, we do not need to introduce dual variables for those constraints; we introduce dual variables mu only for the constraints Bx = y.
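For concreteness, one way to assemble the stacked matrices used in Examples 17 and 18 from the stage data is sketched below. This is only an illustration of the variable ordering x = (x_0, ..., x_N, u_0, ..., u_{N-1}) assumed above, not code from the paper; an embedded implementation would use sparse storage throughout.

```python
import numpy as np
from scipy.linalg import block_diag

def build_stacked_mpc(Phi, Gamma, Q, R, Qf, x0, N):
    """Stack the MPC data for the ordering x = (x_0..x_N, u_0..u_{N-1}).

    Returns H (cost) and A, b (initial state and dynamics: A x = b).
    """
    nx, nu = Phi.shape[0], Gamma.shape[1]
    nz = (N + 1) * nx + N * nu                      # number of decision variables

    # Cost: H = blkdiag(Q, ..., Q, Qf, R, ..., R)
    H = block_diag(*([Q] * N + [Qf] + [R] * N))

    # Equality constraints: x_0 = x0 and x_{t+1} - Phi x_t - Gamma u_t = 0
    A = np.zeros(((N + 1) * nx, nz))
    b = np.zeros((N + 1) * nx)
    A[0:nx, 0:nx] = np.eye(nx)
    b[0:nx] = x0
    for t in range(N):
        r = (t + 1) * nx                            # row block for x_{t+1}
        A[r:r + nx, (t + 1) * nx:(t + 2) * nx] = np.eye(nx)
        A[r:r + nx, t * nx:(t + 1) * nx] = -Phi
        A[r:r + nx, (N + 1) * nx + t * nu:(N + 1) * nx + (t + 1) * nu] = -Gamma
    return H, A, b
```

The inequality data B and d of Example 18 can be stacked in the same way, e.g. B = block_diag(B_x, ..., B_x, B_N, B_u, ..., B_u) with d concatenated accordingly.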

Table 1. Numerical evaluation of the dual gradient methods proposed in this paper. The table reports average and maximum execution time (ms) and average and maximum number of iterations for: the iterations (26)-(29) with L_lambda = A H^{-1} A^T; Richter et al. (2013), i.e. L_lambda = \|A H^{-1} A^T\| I; the iterations (31)-(34) with L_mu = B H^{-1} B^T + 10^{-4} I; Patrinos and Bemporad (2014), i.e. L_mu = \|B H^{-1} B^T\| I; and ADMM as in O'Donoghue et al. (2013); Jerez et al. (2013) for three choices of rho.

Letting H_A = A H^{-1} A^T, the algorithm becomes

    x^k       = H^{-1} ( A^T H_A^{-1} ( A H^{-1} B^T v^k + b ) - B^T v^k )                    (31)
    mu^k      = prox_{g^*}^{L_mu}( v^k + L_mu^{-1} B x^k )                                    (32)
    t^{k+1}   = ( 1 + sqrt(1 + 4 (t^k)^2) ) / 2                                               (33)
    v^{k+1}   = mu^k + ((t^k - 1)/t^{k+1}) (mu^k - mu^{k-1})                                  (34)

where the x^k iterate follows from solving min_x { f(x) + I_{Ax=b}(x) + (v^k)^T B x }. In an implementation, the x^k-update can be implemented as in (31); for efficiency, the matrix products should then be computed offline and stored for online use. Depending on the sparsity of H, A, and B, it might be more efficient to use the KKT system from which (31) is deduced, namely

    [ H   A^T ] [ x^k ]   [ -B^T v^k ]
    [ A   0   ] [ xi  ] = [  b       ].

A sparse LDL factorization of the KKT matrix [H A^T; A 0] is then computed offline for online use, and the online cost of the x^k-update reduces to one forward and one backward solve. Whichever method has the lower number of flops should be chosen. By restricting L_mu to be diagonal, the second step, i.e. (32), can be implemented as

    mu^k = max( 0, v^k + L_mu^{-1} (B x^k - d) ).

To implement the algorithm, the matrix L_mu must also be chosen. From Theorem 10 we know that L_mu \succeq B H^{-1} B^T. Usually in MPC, B has more rows than columns, which implies that B H^{-1} B^T is positive semidefinite only. One option is to choose L_mu = B H^{-1} B^T + epsilon I and to solve prox_{g^*}^{L_mu}( v^k + L_mu^{-1} B x^k ) parametrically for fast execution. Another option is to choose L_mu by solving the semidefinite program

    minimize    tr L_mu
    subject to  L_mu \succeq B H^{-1} B^T
                L_mu \in calL_mu

where calL_mu describes the sparsity structure of L_mu, e.g., diagonal. The splitting method used here is the same as the one used in Patrinos and Bemporad (2014); however, our method is more general since we allow for L_mu-matrices that are not a multiple of the identity. Also, the same splitting is used in O'Donoghue et al. (2013); Jerez et al. (2013), where ADMM (see Boyd et al. (2011)) is used to solve the optimization problem.
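The structured choice of L_mu described above can be computed offline with any SDP solver. A minimal sketch using CVXPY is given below (illustrative only; it assumes B and H are dense numpy arrays and that a diagonal structure is requested).

```python
import numpy as np
import cvxpy as cp

def diagonal_L_mu(B, H):
    """Offline choice of a diagonal L_mu >= B H^{-1} B^T by minimizing tr(L_mu)."""
    M = B @ np.linalg.solve(H, B.T)                 # B H^{-1} B^T
    M = 0.5 * (M + M.T)                             # symmetrize against rounding
    p = M.shape[0]
    ell = cp.Variable(p, nonneg=True)               # diagonal entries of L_mu
    constraints = [cp.diag(ell) - M >> 0]           # L_mu - B H^{-1} B^T is PSD
    cp.Problem(cp.Minimize(cp.sum(ell)), constraints).solve()
    return np.diag(ell.value)
```

With such a diagonal L_mu, the prox step (32) reduces to the componentwise max-operation given above, so the online iteration remains free of matrix factorizations in mu.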
7. NUMERICAL EXAMPLE

The proposed algorithms are evaluated by applying them to the AFTI-16 aircraft model in Kapasouris et al. (1990); Bemporad et al. (1997). This problem is also a tutorial example in the MPC toolbox in MATLAB. As in Bemporad et al. (1997) and the MPC toolbox tutorial, the continuous-time model from Kapasouris et al. (1990) is sampled using a zero-order hold every 0.05 s. The system has four states x = (x_1, x_2, x_3, x_4), two outputs y = (y_1, y_2), two inputs u = (u_1, u_2), and obeys the dynamics

    x^+ = Phi x + Gamma u,    y = C x

where x^+ denotes the state at the next time step and Phi, Gamma, C denote the dynamics, input, and output matrices of the sampled model (their numerical values follow from the references above). The system is unstable; the magnitude of the largest eigenvalue of the dynamics matrix is greater than one. The outputs are the attack and pitch angles, while the inputs are the elevator and flaperon angles. The inputs are physically constrained to satisfy |u_i| <= 25, i = 1, 2. The outputs are soft constrained, with slack variables s = (s_1, s_2, s_3, s_4) >= 0 relaxing the lower and upper output bounds. The cost in each time step is

    l(x, u, s) = (x - x_r)^T Q (x - x_r) + u^T R u + s^T S s

where Q = C^T Q_y C + Q_x with diagonal weights Q_y and Q_x, x_r is such that y_r = C x_r where y_r is the output reference that may change at each time step, R is a small multiple of the identity, and S is a large multiple of the identity that penalizes the slack variables. The resulting full cost matrix is very ill-conditioned. Further, the terminal cost is Q, and the control and prediction horizon is N = 10.

The numerical data in Table 1 is obtained by following a reference trajectory on the output. The objective is to change the pitch angle from 0 to 10 degrees and then back to 0, while the angle of attack satisfies the output constraints. The constraints on the angle of attack limit the rate at which the pitch angle can be changed. All algorithms in the comparison are implemented in MATLAB on a Linux machine using a single core. To create an easily transferable and fair termination criterion, the optimal solution x^* of each optimization problem is computed to high accuracy using an interior point solver. The termination condition is \|x^k - x^*\| / \|x^*\| <= 0.005, where x^k is the primal iterate in the algorithm, i.e. a relative accuracy of 0.5% in the primal solution is required.

The algorithms in Example 17, i.e. (26)-(29), and in Example 18, i.e. (31)-(34), have been applied to this problem. Due to the slack variables, (30) cannot replace (26) for the x^k-update. However, the x^k-minimization is separable in the constraints, and each of the projections can be solved by a multiparametric program with two regions. This is almost as computationally inexpensive as the x^k-update in (30). Further, we use L_lambda = A H^{-1} A^T. The algorithm (26)-(29) is a generalization of Richter et al. (2013) that allows for general matrices L_lambda; the algorithm in Richter et al. (2013) is obtained by setting L_lambda = \|A H^{-1} A^T\| I. The numerical evaluation in Table 1 reveals that this generalization improves the execution time by more than three orders of magnitude for this problem.

The formulation in Example 18, i.e. (31)-(34), directly covers this MPC formulation with soft constraints. For this algorithm, we choose L_mu = B H^{-1} B^T + 10^{-4} I and solve (32), i.e. prox_{g^*}^{L_mu}( v^k + L_mu^{-1} B x^k ), parametrically. It turns out that also this parametric program is cheap to implement, since it requires one max-operation only. The resulting algorithm is a generalization of the algorithm in Patrinos and Bemporad (2014), which is obtained by setting L_mu = \|B H^{-1} B^T\| I in the iterations (31)-(34). Table 1 indicates that this generalization improves the algorithm by one to two orders of magnitude compared to Patrinos and Bemporad (2014).

Further, (31)-(34) is based on the same splitting as the method in O'Donoghue et al. (2013); Jerez et al. (2013). The difference is that here the problem is solved with a generalized fast dual gradient method, while in O'Donoghue et al. (2013); Jerez et al. (2013) it is solved using ADMM. In ADMM, the rho-parameter needs to be chosen. However, no exact guidelines are yet known for this choice, and the performance of the algorithm often relies heavily on this parameter. We compare our algorithm with ADMM using the best rho that we found, rho = 4, and with one larger and one smaller rho. Table 1 reports that the execution time for our method is one to two orders of magnitude smaller (or more, if the rho-parameter in O'Donoghue et al. (2013); Jerez et al. (2013) is chosen suboptimally) than that of the algorithm proposed in O'Donoghue et al. (2013); Jerez et al. (2013).

8. CONCLUSIONS

We have proposed a generalization of fast dual gradient methods. This generalization allows the algorithm, in each iteration, to minimize a quadratic upper bound to the negative dual function with different curvature in different directions. This is in contrast to the standard fast dual gradient method, where a quadratic upper bound to the negative dual function with the same curvature in all directions is minimized in each iteration. The generalization is made possible by the main contribution of this paper, which characterizes the set of matrices that can be used to describe a quadratic upper bound to the negative dual function. The numerical evaluation on an ill-conditioned aircraft problem reveals that the proposed algorithms outperform, by one to three orders of magnitude, other first order algorithms that have recently been proposed as suitable for embedded model predictive control.

REFERENCES

A. Beck and M. Teboulle. A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM Journal on Imaging Sciences, 2(1):183-202, 2009.

A. Bemporad, A. Casavola, and E. Mosca. Nonlinear control of constrained linear systems via predictive reference management. IEEE Transactions on Automatic Control, 42(3):340-349, 1997.
A. Bemporad, M. Morari, V. Dua, and E. N. Pistikopoulos. The explicit linear quadratic regulator for constrained systems. Automatica, 38(1):3-20, January 2002.

S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine Learning, 3(1):1-122, 2011.

P. Giselsson. Improving fast dual ascent for MPC - Part II: The embedded case. Automatica, 2014. Submitted.

J. L. Jerez, P. J. Goulart, S. Richter, G. A. Constantinides, E. C. Kerrigan, and M. Morari. Embedded online optimization for model predictive control at megahertz rates. IEEE Transactions on Automatic Control, 2013. Submitted.

P. Kapasouris, M. Athans, and G. Stein. Design of feedback control systems for unstable plants with saturating actuators. In Proceedings of the IFAC Symposium on Nonlinear Control System Design. Pergamon Press, 1990.

Y. Nesterov. A method of solving a convex programming problem with convergence rate O(1/k^2). Soviet Mathematics Doklady, 27(2):372-376, 1983.

Y. Nesterov. Introductory Lectures on Convex Optimization: A Basic Course. Springer Netherlands, 1st edition, 2003.

Y. Nesterov. Smooth minimization of non-smooth functions. Mathematical Programming, 103(1):127-152, May 2005.

B. O'Donoghue, G. Stathopoulos, and S. Boyd. A splitting method for optimal control. IEEE Transactions on Control Systems Technology, 21(6):2432-2442, 2013.

P. Patrinos and A. Bemporad. An accelerated dual gradient-projection algorithm for embedded linear model predictive control. IEEE Transactions on Automatic Control, 59(1):18-33, 2014.

S. Richter, C. N. Jones, and M. Morari. Certification aspects of the fast gradient method for solving the dual of parametric convex programs. Mathematical Methods of Operations Research, 77(3):305-321, 2013.

R. T. Rockafellar. Convex Analysis, volume 28. Princeton University Press, Princeton, NJ, 1970.

P. Tseng. On accelerated proximal gradient methods for convex-concave optimization. Technical report. Available: papers/apgm.pdf, May 2008.

W. Zuo and Z. Lin. A generalized accelerated proximal gradient approach for total-variation-based image restoration. IEEE Transactions on Image Processing, 20(10):2748-2759, October 2011.


More information

Distributed and Real-time Predictive Control

Distributed and Real-time Predictive Control Distributed and Real-time Predictive Control Melanie Zeilinger Christian Conte (ETH) Alexander Domahidi (ETH) Ye Pu (EPFL) Colin Jones (EPFL) Challenges in modern control systems Power system: - Frequency

More information

On convergence rate of the Douglas-Rachford operator splitting method

On convergence rate of the Douglas-Rachford operator splitting method On convergence rate of the Douglas-Rachford operator splitting method Bingsheng He and Xiaoming Yuan 2 Abstract. This note provides a simple proof on a O(/k) convergence rate for the Douglas- Rachford

More information

Lecture 3: Latent Variables Models and Learning with the EM Algorithm. Sam Roweis. Tuesday July25, 2006 Machine Learning Summer School, Taiwan

Lecture 3: Latent Variables Models and Learning with the EM Algorithm. Sam Roweis. Tuesday July25, 2006 Machine Learning Summer School, Taiwan Lecture 3: Latent Variables Models and Learning with the EM Algorithm Sam Roweis Tuesday July25, 2006 Machine Learning Summer School, Taiwan Latent Variable Models What to do when a variable z is always

More information

Douglas-Rachford Splitting: Complexity Estimates and Accelerated Variants

Douglas-Rachford Splitting: Complexity Estimates and Accelerated Variants 53rd IEEE Conference on Decision and Control December 5-7, 204. Los Angeles, California, USA Douglas-Rachford Splitting: Complexity Estimates and Accelerated Variants Panagiotis Patrinos and Lorenzo Stella

More information

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17

EE/ACM Applications of Convex Optimization in Signal Processing and Communications Lecture 17 EE/ACM 150 - Applications of Convex Optimization in Signal Processing and Communications Lecture 17 Andre Tkacenko Signal Processing Research Group Jet Propulsion Laboratory May 29, 2012 Andre Tkacenko

More information

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem

Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Lecture 3: Lagrangian duality and algorithms for the Lagrangian dual problem Michael Patriksson 0-0 The Relaxation Theorem 1 Problem: find f := infimum f(x), x subject to x S, (1a) (1b) where f : R n R

More information

Operators. Pontus Giselsson

Operators. Pontus Giselsson Operators Pontus Giselsson 1 Today s lecture forward step operators gradient step operators (subclass of forward step operators) resolvents proimal operators (subclass of resolvents) reflected resolvents

More information

Operations Research Letters

Operations Research Letters Operations Research Letters 37 (2009) 1 6 Contents lists available at ScienceDirect Operations Research Letters journal homepage: www.elsevier.com/locate/orl Duality in robust optimization: Primal worst

More information

Chapter Adequacy of Solutions

Chapter Adequacy of Solutions Chapter 04.09 dequac of Solutions fter reading this chapter, ou should be able to: 1. know the difference between ill-conditioned and well-conditioned sstems of equations,. define the norm of a matri,

More information

A Smoothing SQP Framework for a Class of Composite L q Minimization over Polyhedron

A Smoothing SQP Framework for a Class of Composite L q Minimization over Polyhedron Noname manuscript No. (will be inserted by the editor) A Smoothing SQP Framework for a Class of Composite L q Minimization over Polyhedron Ya-Feng Liu Shiqian Ma Yu-Hong Dai Shuzhong Zhang Received: date

More information

The Entropy Power Inequality and Mrs. Gerber s Lemma for Groups of order 2 n

The Entropy Power Inequality and Mrs. Gerber s Lemma for Groups of order 2 n The Entrop Power Inequalit and Mrs. Gerber s Lemma for Groups of order 2 n Varun Jog EECS, UC Berkele Berkele, CA-94720 Email: varunjog@eecs.berkele.edu Venkat Anantharam EECS, UC Berkele Berkele, CA-94720

More information

Lasso: Algorithms and Extensions

Lasso: Algorithms and Extensions ELE 538B: Sparsity, Structure and Inference Lasso: Algorithms and Extensions Yuxin Chen Princeton University, Spring 2017 Outline Proximal operators Proximal gradient methods for lasso and its extensions

More information

Computation fundamentals of discrete GMRF representations of continuous domain spatial models

Computation fundamentals of discrete GMRF representations of continuous domain spatial models Computation fundamentals of discrete GMRF representations of continuous domain spatial models Finn Lindgren September 23 2015 v0.2.2 Abstract The fundamental formulas and algorithms for Bayesian spatial

More information

CONSERVATION LAWS AND CONSERVED QUANTITIES FOR LAMINAR RADIAL JETS WITH SWIRL

CONSERVATION LAWS AND CONSERVED QUANTITIES FOR LAMINAR RADIAL JETS WITH SWIRL Mathematical and Computational Applications,Vol. 15, No. 4, pp. 742-761, 21. c Association for Scientific Research CONSERVATION LAWS AND CONSERVED QUANTITIES FOR LAMINAR RADIAL JETS WITH SWIRL R. Naz 1,

More information

Research Article On Sharp Triangle Inequalities in Banach Spaces II

Research Article On Sharp Triangle Inequalities in Banach Spaces II Hindawi Publishing Corporation Journal of Inequalities and Applications Volume 2010, Article ID 323609, 17 pages doi:10.1155/2010/323609 Research Article On Sharp Triangle Inequalities in Banach Spaces

More information

LOCAL LINEAR CONVERGENCE OF ADMM Daniel Boley

LOCAL LINEAR CONVERGENCE OF ADMM Daniel Boley LOCAL LINEAR CONVERGENCE OF ADMM Daniel Boley Model QP/LP: min 1 / 2 x T Qx+c T x s.t. Ax = b, x 0, (1) Lagrangian: L(x,y) = 1 / 2 x T Qx+c T x y T x s.t. Ax = b, (2) where y 0 is the vector of Lagrange

More information

Lecture 23: Conditional Gradient Method

Lecture 23: Conditional Gradient Method 10-725/36-725: Conve Optimization Spring 2015 Lecture 23: Conditional Gradient Method Lecturer: Ryan Tibshirani Scribes: Shichao Yang,Diyi Yang,Zhanpeng Fang Note: LaTeX template courtesy of UC Berkeley

More information

Simple and Certifiable Quadratic Programming Algorithms for Embedded Linear Model Predictive Control

Simple and Certifiable Quadratic Programming Algorithms for Embedded Linear Model Predictive Control Preprints of 4th IFAC Nonlinear Model Predictive Control Conference International Federation of Automatic Control Simple and Certifiable Quadratic Programming Algorithms for Embedded Linear Model Predictive

More information

Optimization for Machine Learning

Optimization for Machine Learning Optimization for Machine Learning (Problems; Algorithms - A) SUVRIT SRA Massachusetts Institute of Technology PKU Summer School on Data Science (July 2017) Course materials http://suvrit.de/teaching.html

More information

Preconditioning via Diagonal Scaling

Preconditioning via Diagonal Scaling Preconditioning via Diagonal Scaling Reza Takapoui Hamid Javadi June 4, 2014 1 Introduction Interior point methods solve small to medium sized problems to high accuracy in a reasonable amount of time.

More information