Generalized Derivatives: Automatic Evaluation & Implications for Algorithms
Paul I. Barton, Kamil A. Khan & Harry A. J. Watson
Slide 1: Generalized Derivatives: Automatic Evaluation & Implications for Algorithms
Paul I. Barton, Kamil A. Khan & Harry A. J. Watson
Process Systems Engineering Laboratory, Massachusetts Institute of Technology
Slide 2: Nonsmooth Equation Solving
Semismooth Newton method: solve $G(x^k)(x^{k+1} - x^k) = -f(x^k)$, where $G(x^k)$ is some element of a generalized derivative of $f$ at $x^k$.
LP-Newton method:
$\min_{\gamma, x} \ \gamma$
s.t. $\|f(x^k) + G(x^k)(x - x^k)\| \le \gamma \|f(x^k)\|^2$
$\|x - x^k\| \le \gamma \|f(x^k)\|$
$x \in X$ (a polyhedral set)
Kojima & Shindo (1986), Qi & Sun (1993), Facchinei, Fischer & Herrich (2014).
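To make the iteration concrete, here is a minimal one-dimensional sketch of the semismooth Newton step. The test function and the generalized-derivative selection `g` are illustrative choices, not from the slides: any element of a generalized derivative may be supplied at a kink.

```python
def semismooth_newton(f, g, x0, tol=1e-10, max_iter=50):
    """Semismooth Newton iteration: g(x) returns ANY element of a
    generalized derivative of f at x (scalar case for simplicity)."""
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        x = x - fx / g(x)  # Newton step with a generalized-derivative element
    return x

# Nonsmooth test problem: f(x) = max(x, 2x) - 1, with root x* = 0.5.
f = lambda x: max(x, 2 * x) - 1
g = lambda x: 2.0 if x > 0 else 1.0  # an element of the B-subdifferential
root = semismooth_newton(f, g, x0=-3.0)
```

Starting from the smooth region $x < 0$, the iteration jumps across the kink at 0 and then converges on the active smooth piece.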
Slide 3: Generalized Derivatives
Suppose $f$ is locally Lipschitz $\Rightarrow$ differentiable on a set $S$.
B-subdifferential: $\partial_B f(x) := \{H : H = \lim_{i \to \infty} Jf(x^{(i)}),\ x = \lim_{i \to \infty} x^{(i)},\ x^{(i)} \in S\}$
Clarke Jacobian: $\partial f(x) := \mathrm{conv}\, \partial_B f(x)$
Example, $f(x) = |x|$: $f'(x) = -1$ for $x < 0$ and $f'(x) = 1$ for $x > 0$, so $\partial_B f(0) = \{-1, 1\}$ and $\partial f(0) = [-1, 1]$.
Useful properties of $\partial f(x)$ (Clarke, 1973):
- Nonempty, convex, and compact
- Satisfies mean-value theorem, implicit/inverse function theorems
- Reduces to the subdifferential/derivative when $f$ is convex/strictly differentiable
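The limit definition of the B-subdifferential can be illustrated numerically: differentiating $f(x) = |x|$ along sequences of differentiable points approaching 0 from either side recovers the two elements $\{-1, +1\}$. The helper below is an illustrative sketch.

```python
# Approximate B-subdifferential elements of f(x) = |x| at 0 by taking
# derivatives at nearby points where f is smooth, then letting the
# points approach 0.
def deriv(f, x, h=1e-8):
    # central difference; valid here since f is smooth near each x != 0
    return (f(x + h) - f(x - h)) / (2 * h)

f = abs
from_right = [deriv(f, 10.0 ** (-k)) for k in range(1, 6)]   # all near +1
from_left = [deriv(f, -10.0 ** (-k)) for k in range(1, 6)]   # all near -1
```

The two limiting values $-1$ and $+1$ are exactly $\partial_B f(0)$; their convex hull $[-1, 1]$ is the Clarke generalized gradient.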
Slide 4: Convergence Properties
Suppose the generalized derivative contains no singular matrices at the solution.
Semismooth Newton method with $G(x^k) \in \partial f(x^k)$:
- local Q-superlinear convergence if $f$ is semismooth
- local Q-quadratic convergence if $f$ is strongly semismooth
Semismooth Newton & LP-Newton methods with $G(x^k) \in \partial_B f(x^k)$, for PC$^1$ or strongly semismooth functions:
- local Q-quadratic convergence
Automatic/Algorithmic Differentiation (AD):
- automatic methods for computing derivatives in complex settings
- an automatic method for computing elements of generalized derivatives?
- computationally relevant generalized derivatives
Slide 5: "All generalized derivatives are equal. But some are more equal than others."
Slide 6: Obstacles to Automatic Generalized Derivative Evaluation, 1
Automatically evaluating Clarke Jacobian elements is difficult: the Clarke calculus lacks sharp rules. Example: let $g(x) = \max\{0, x\}$ and $h(x) = \min\{0, x\}$, so that $f(x) = g(x) + h(x) = x$. Then $\partial g(0) = [0, 1]$ and $\partial h(0) = [0, 1]$, while $\partial f(0) = \{1\}$: the sum rule holds only as the strict inclusion $\partial f(0) \subsetneq \partial g(0) + \partial h(0) = [0, 2]$.
Slide 7: Directional Derivatives & PC$^1$ Functions
Directional derivative: $f'(x; d) = \lim_{t \to 0^+} \frac{f(x + td) - f(x)}{t}$
Sharp chain rule for locally Lipschitz, directionally differentiable functions: $[f \circ g]'(x; d) = f'(g(x); g'(x; d))$. AD gives the directional derivative.
PC$^1$ functions: there is a finite collection $F_f(x)$ of $C^1$ functions for which $f(y) \in \{\varphi(y) : \varphi \in F_f(x)\}$ for all $y$ in a neighborhood $N(x)$. The 2-norm is not PC$^1$.
Griewank (1994), Scholtes (2012).
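The sharp chain rule can be propagated forward through elemental operations by carrying (value, directional-derivative) pairs. A minimal sketch, with illustrative names: the only nonstandard rules are for min and max at ties, where the directional derivative of the max (min) is the max (min) of the arguments' directional derivatives. Applied to the previous slide's example $f(x) = \max\{0, x\} + \min\{0, x\}$, this recovers the exact directional derivative where Clarke calculus only gave an enclosure.

```python
# Forward propagation of directional derivatives f'(x; d) via the
# sharp chain rule for directionally differentiable functions.
class DD:
    def __init__(self, val, dot):
        self.val, self.dot = val, dot

    def __add__(self, other):
        return DD(self.val + other.val, self.dot + other.dot)

def dmax(a, b):
    if a.val > b.val:
        return DD(a.val, a.dot)
    if a.val < b.val:
        return DD(b.val, b.dot)
    return DD(a.val, max(a.dot, b.dot))  # tie: max of directional derivs

def dmin(a, b):
    if a.val < b.val:
        return DD(a.val, a.dot)
    if a.val > b.val:
        return DD(b.val, b.dot)
    return DD(a.val, min(a.dot, b.dot))  # tie: min of directional derivs

# f(x) = max(0, x) + min(0, x) = x, evaluated at x = 0 in direction d = 1:
zero = DD(0.0, 0.0)
x = DD(0.0, 1.0)
fx = dmax(zero, x) + dmin(zero, x)
# fx.dot is 1.0, the exact directional derivative f'(0; 1).
```

Contrast this with the Clarke sum rule, which only bounded the derivative inside $[0, 2]$.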
Slides 8-9: Obstacles, 2
PC$^1$ functions have piecewise linear directional derivatives: the direction space is partitioned into cones on which $f'(x; d) = B^{(i)} d$. [Figure: the $(d_1, d_2)$-plane split into cones with $f'(x; d) = B^{(1)} d$, $B^{(2)} d$, and $B^{(3)} d$.]
Directional derivatives in the coordinate directions do not necessarily give B-subdifferential elements. This also defeats finite differences.
Slide 10: Obstacles, 3
The Clarke Jacobian $\partial f(x)$ may be a strict subset of the Cartesian product $\times_{i=1}^m \partial f_i(x)$ of the component Clarke generalized gradients. In the two-dimensional example on the slide, the rows of the matrices in $\partial f(0)$ are coupled through a single parameter $s \in [0, 1]$, whereas $\partial f_1(0) \times \partial f_2(0)$ lets the rows vary independently via $(s_1, s_2) \in [0, 1]^2$, so $\partial f(0) \subsetneq \partial f_1(0) \times \partial f_2(0)$.
Slide 11: L-smooth Functions
The following functions $f : X \subseteq \mathbb{R}^n \to \mathbb{R}^m$ are L-smooth:
- Continuously differentiable functions
- Convex functions (e.g. abs, the 2-norm)
- PC$^1$ functions
- Compositions of L-smooth functions: $x \mapsto h(g(x))$
- Integrals of L-smooth functions: $x \mapsto \int_a^b g(t, x)\, dt$
- Solutions of ODEs with L-smooth right-hand sides: $c \mapsto x(b, c)$, where $\frac{dx}{dt}(t, c) = g(t, x(t, c))$, $x(0, c) = c$
Nesterov (1987), Khan and Barton (2014), Khan and Barton (2015).
Slide 12: Lexicographic Derivatives
L-subdifferential: $\partial_L f(x) = \{J_L f(x; M) : \det M \neq 0\}$, i.e. it contains the L-derivatives $J_L f(x; M)$ for all nonsingular direction matrices $M$.
Useful properties:
- L-derivatives equal the classical derivative wherever $f$ is strictly differentiable
- L-derivatives are elements of the Clarke generalized gradient
- $\partial_L f(x)$ contains only subgradients when $f$ is convex
- $\partial_L f(x)$ is contained in the plenary hull of the Clarke Jacobian, and can be used in place of the Clarke Jacobian in numerical methods: $\{Ad : A \in \partial_L f(x)\} \subseteq \{Ad : A \in \partial f(x)\}$ for each $d \in \mathbb{R}^n$
- For PC$^1$ functions, L-derivatives are elements of the B-subdifferential
- Satisfies a sharp chain rule, expressed naturally using LD-derivatives
Nesterov (1987), Khan and Barton (2014), Khan and Barton (2015).
Slide 13: Lexicographic Directional (LD-)Derivatives
An extension of the classical directional derivative. For any $M := [m^{(1)} \cdots m^{(p)}] \in \mathbb{R}^{n \times p}$, the LD-derivative is
$f'(x; M) = [f^{(0)}_{x,M}(m^{(1)}) \ \cdots \ f^{(p-1)}_{x,M}(m^{(p)})]$
If $M$ is square and nonsingular: $f'(x; M) = J_L f(x; M)\, M$
If $f$ is differentiable at $x$: $f'(x; M) = Jf(x)\, M$
Sharp LD-derivative chain rule: $[f \circ g]'(x; M) = f'(g(x); g'(x; M))$
Khan and Barton (2015).
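For the elemental function abs, the LD-derivative admits a simple closed form consistent with the lexicographic definition: the row $f'(x; M)$ equals $s \cdot M$, where $s$ is the sign of the first nonzero entry of the sequence $(x, m^{(1)}, \ldots, m^{(p)})$. The helper names below are illustrative, not from the slides.

```python
# LD-derivative of abs via the lexicographic sign rule: the first
# nonzero entry of (x, m1, ..., mp) decides which linear piece of |.|
# is active for the whole row.
def fsign(seq):
    for v in seq:
        if v != 0:
            return 1.0 if v > 0 else -1.0
    return 0.0

def abs_ld(x, M):
    """LD-derivative row of |.| at x along directions M (a list)."""
    s = fsign([x] + list(M))
    return [s * m for m in M]

# Away from the kink, the rule reduces to the classical derivative:
smooth_row = abs_ld(3.0, [1.0, 2.0])          # sign(3) * M = [1.0, 2.0]
# At the kink x = 0 with M = [0, -2, 5], the first nonzero entry is -2:
kink_row = abs_ld(0.0, [0.0, -2.0, 5.0])      # -1 * M = [0.0, 2.0, -5.0]
```

Note how, at the kink, a leading zero direction defers the branch decision to the next direction, which is exactly the lexicographic behavior of the nested directional derivatives $f^{(0)}_{x,M}, f^{(1)}_{x,M}, \ldots$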
Slide 14: Vector Forward AD Mode for LD-derivatives
The sharp chain rule immediately implies that, given the seed directions $M$, forward-mode AD can compute $f'(x; M)$. We need calculus rules for the elementary functions:
- abs, min, max, mid, etc.
- an algorithm for elemental PC$^1$ functions
- linear programs and lexicographic linear programs parameterized by their right-hand sides
- implicit functions: $w'(\hat z; M)$ is the unique solution $N$ of $h'((\hat y, \hat z); (N, M)) = 0$, where $h(w(z), z) = 0$
Khan and Barton (2015), Khan and Barton (2013), Hoeffner et al. (2015).
Slide 15: Semismooth Inexact Newton Method
Inexact Newton method: solve $J_L f(x; M)\, \Delta x = -f(x)$ iteratively. But the directional derivative is not a linear function of the directions. Let $M = [d_1\ d_2\ \cdots]$ be nonsingular; then $f'(x; M) = J_L f(x; M)\, M$. Since $M$ is not known in advance, compute the columns $J(x) d_i$, $i = 1, 2, \ldots$ of $f'(x; M)$ one at a time:
- computation of a column affects subsequent columns
- the automatic code can be "locked" to record the influence of earlier columns
Local Q-superlinear & Q-quadratic convergence rates can be achieved.
Slide 16: Approximation of LD-derivatives Using Finite Differences
LD-derivative, for $M := [m^{(1)} \cdots m^{(p)}] \in \mathbb{R}^{n \times p}$:
$f'(x; M) = [f^{(0)}_{x,M}(m^{(1)}) \ \cdots \ f^{(p-1)}_{x,M}(m^{(p)})]$
FD approximation of $f'(x; M)$ using $p + 1$ function evaluations:
$f^{(0)}_{x,M}(m^{(1)}) \approx \alpha^{-1}[f(x + \alpha m^{(1)}) - f(x)] =: D_{\alpha m^{(1)}}[f](x)$
$f^{(1)}_{x,M}(m^{(2)}) \approx D_{\alpha m^{(2)}}[f^{(0)}_{x,M}](m^{(1)}) = D_{\alpha m^{(2)}} D_{\alpha m^{(1)}}[f](x)$
$\vdots$
$f^{(p-1)}_{x,M}(m^{(p)}) \approx D_{\alpha m^{(p)}} \cdots D_{\alpha m^{(2)}} D_{\alpha m^{(1)}}[f](x)$
For $p = 2$ this expands to three evaluation points $x$, $x + \alpha m^{(1)}$, $x + \alpha m^{(1)} + \alpha_2 m^{(2)}$:
$f^{(0)}_{x,M}(m^{(1)}) \approx \alpha^{-1}[f(x + \alpha m^{(1)}) - f(x)]$
$f^{(1)}_{x,M}(m^{(2)}) \approx \alpha_2^{-1}[f(x + \alpha m^{(1)} + \alpha_2 m^{(2)}) - f(x + \alpha m^{(1)})]$
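A small sketch of this $p + 1$-evaluation scheme: each successive direction is added to the previous evaluation point with a much smaller step, so that earlier directions dominate later ones. The function and step-size choices are illustrative.

```python
# Finite-difference approximation of the LD-derivative f'(x; M) with
# p+1 function evaluations: evaluation points x, x + a1*m1,
# x + a1*m1 + a2*m2, ... with a1 >> a2 >> ...
def ld_derivative_fd(f, x, M, alpha=1e-3, ratio=1e-3):
    """M: list of direction vectors; returns the p FD columns."""
    pts = [list(x)]
    steps = []
    step = alpha
    for m in M:
        prev = pts[-1]
        pts.append([xi + step * mi for xi, mi in zip(prev, m)])
        steps.append(step)
        step *= ratio  # each later direction gets a much smaller step
    return [(f(pts[j + 1]) - f(pts[j])) / steps[j] for j in range(len(M))]

# Example: f(x) = |x1| at the kink x = (0,), directions M = [(-1,), (1,)].
# The first direction fixes the active branch, so the exact
# LD-derivative is [1, -1].
f = lambda v: abs(v[0])
cols = ld_derivative_fd(f, [0.0], [[-1.0], [1.0]])
```

Note that naive coordinate-direction differences would have missed this branch dependence, which is exactly the obstacle described on Slides 8-9.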
Slide 17: Sparse Accumulation for L-derivatives
The cost of AD can be reduced when the Jacobian is sparse:
- Find structurally orthogonal columns and perform the vector forward pass with a compressed seed matrix $M \in \mathbb{R}^{n \times p}$ rather than $I \in \mathbb{R}^{n \times n}$. Example sparsity pattern:
$\begin{bmatrix} a & b & 0 & 0 \\ c & 0 & d & 0 \\ 0 & e & 0 & f \\ 0 & 0 & g & h \end{bmatrix}$
- For LD-derivatives the order of the directions matters: $f'(x; M) = f'(x; Q) D$ is not true in general.
- Corresponding to $M$ is an uncompressed (permutation) matrix $Q$ with $M = QD$ for some matrix $D$.
- Procedure:
- Identify the matrices $Q$, $D$, and $M$
- Perform the vector forward pass to calculate $f'(x; M)$
- Copy the entries of $f'(x; M)$ into the entries of a sparse data structure for $f'(x; Q)$, done under the assumption that $f'(x; Q) D = f'(x; M)$
- Calculate $J_L f(x; Q) = f'(x; Q)\, Q^{-1}$ (i.e. by sparse permutation)
Slide 18: Generalized Derivatives of Algorithms: the MHEX Model
[Figure: a multistream heat exchanger (MHEX) with hot streams $F_i$, $T_i^{in} \to T_i^{out}$, $i \in H$, and cold streams $f_j$, $t_j^{in} \to t_j^{out}$, $j \in C$.]
Model equations:
$\sum_{i \in H} F_i (T_i^{in} - T_i^{out}) = \sum_{j \in C} f_j (t_j^{out} - t_j^{in})$
$\min_{p \in P} (EBP_p^C - EBP_p^H) = 0$
$UA - \sum_{k \in K} \frac{\Delta Q^k}{\Delta T_{LM}^k} = 0$
Watson et al. (2015).
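The first model equation, the overall energy balance, can be sketched as a residual in a few lines. The stream data below are made-up illustrative numbers, not from the case studies.

```python
# Residual of the MHEX energy balance:
# sum_i F_i (T_i_in - T_i_out) - sum_j f_j (t_j_out - t_j_in) = 0
def energy_balance(hot, cold):
    """hot: list of (F, T_in, T_out); cold: list of (f, t_in, t_out)."""
    q_hot = sum(F * (Tin - Tout) for F, Tin, Tout in hot)
    q_cold = sum(f * (tout - tin) for f, tin, tout in cold)
    return q_hot - q_cold

# One hot stream cooling 400 K -> 320 K balanced against one cold
# stream heating 300 K -> 380 K, with equal heat-capacity flow rates:
r = energy_balance([(2.0, 400.0, 320.0)], [(2.0, 300.0, 380.0)])
# r is 0.0 at this balanced operating point
```

In the full model this residual is solved simultaneously with the nonsmooth pinch and area equations, which is where the generalized-derivative machinery is needed.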
Slide 19: ODEs and BVPs
$\frac{dx}{dt}(t, c) = f(t, x(t, c)), \quad x(t_0, c) = c$
The LD-derivative mapping $t \mapsto [x_t]'(c_0; M)$ uniquely solves the ODE:
$\frac{dA}{dt}(t) = [f_t]'(x(t, c_0); A(t)), \quad A(t_0) = M$
Boundary value problem: $0 = F(c, x(t_f, c))$. Solve with the semismooth (inexact) Newton method using the chain rule for LD-derivatives:
- if in addition $f$ is semismooth, then a Q-superlinear convergence rate
- if it happens to be PC$^1$, then a Q-quadratic convergence rate
Khan and Barton (2014), Khan and Barton (2015), Pang and Stewart (2009).
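The sensitivity ODE can be integrated numerically alongside the state. A minimal sketch for the single-direction ($p = 1$) scalar case, with an illustrative right-hand side $f(x) = |x|$ and $x(0) = 0$, so that the trajectory sits on the kink for all time: the directional derivative of abs at 0 along $A$ is $|A|$, giving $A' = |A|$ and, with $A(0) = 1$, the exact sensitivity $A(t) = e^t$.

```python
import math

def abs_dd(x, a):
    # directional derivative of |.| at x along a
    return math.copysign(1.0, x) * a if x != 0 else abs(a)

def euler_sensitivity(t_end, n):
    """Forward-Euler integration of x' = |x| and A' = abs_dd(x, A)."""
    h = t_end / n
    x, A = 0.0, 1.0  # x(0) = 0 stays on the kink of |x| for all t
    for _ in range(n):
        A += h * abs_dd(x, A)
        x += h * abs(x)
    return x, A

xT, AT = euler_sensitivity(1.0, 10000)
# AT approximates e = 2.71828..., the exact directional sensitivity at t = 1
```

A classical forward-sensitivity ODE would be undefined here, since the right-hand side is nondifferentiable along the entire trajectory; the nonsmooth sensitivity ODE still yields the correct directional information.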
Slide 20: Conclusions
L-derivatives are computationally relevant generalized derivatives, and can be computed automatically for broad classes of functions. Strong theory gives practically computable generalized derivatives for:
- Implicit functions
- Algorithms
- ODE solutions
- Linear programs
- etc.
Slide 21: Acknowledgments
Peter Stechlinski, Novartis, Statoil.
Slide 22: Obstacles, 3 (details)
$\partial f(x)$ may be a strict subset of $\times_{i=1}^m \partial f_i(x)$. In the two-dimensional example, $\partial f(0)$ is a family of matrices parameterized by a single $s \in [0, 1]$, with the rows coupled, whereas $\partial f_1(0) \times \partial f_2(0)$ is parameterized by independent $(s_1, s_2) \in [0, 1]^2$, so the inclusion is strict.
Slide 23: Lexicographic Differentiation [1]
$f : X \subseteq \mathbb{R}^n \to \mathbb{R}^m$ is L-smooth at $x \in X$ if it is locally Lipschitz continuous and directionally differentiable, and if, for any $M := [m^{(1)} \cdots m^{(p)}] \in \mathbb{R}^{n \times p}$, the following functions exist:
$f^{(0)}_{x,M} : d \mapsto f'(x; d)$
$f^{(1)}_{x,M} : d \mapsto [f^{(0)}_{x,M}]'(m^{(1)}; d)$
$\vdots$
$f^{(p)}_{x,M} : d \mapsto [f^{(p-1)}_{x,M}]'(m^{(p)}; d)$
If the columns of $M$ span $\mathbb{R}^n$, then $f^{(p)}_{x,M}$ is linear $\Rightarrow$ $J_L f(x; M) := J f^{(p)}_{x,M}(0)$.
Lexicographic subdifferential: $\partial_L f(x) = \{J_L f(x; M) : M \in \mathbb{R}^{n \times n},\ \det M \neq 0\}$
The class of L-smooth functions is closed under composition, and includes all smooth functions and all convex functions.
[1]: Y. Nesterov, Math. Program. B, 104 (2005).
Slide 24: Inverse and Implicit Functions
LD-derivatives for inverse functions:
- Suppose $f$ is L-smooth and locally invertible near $\hat y$, $f(\hat y) = \hat z$, and $f^{-1}$ is Lipschitz near $\hat z$
- Result: $f^{-1}$ is also L-smooth at $\hat z$
- Result: $[f^{-1}]'(\hat z; M)$ is the unique solution $N$ of $f'(\hat y; N) = M$
LD-derivatives for implicit functions:
- Suppose $h$ is L-smooth, $h(\hat y, \hat z) = 0$, and there exists an implicit function $w$ such that $h(w(z), z) = 0$ for each $z$ near $\hat z$
- Result: $w$ is L-smooth at $\hat z$
- Result: $w'(\hat z; M)$ is the unique solution $N$ of $h'((\hat y, \hat z); (N, M)) = 0$
1: Khan and Barton, submitted.
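The inverse-function rule can be seen in a tiny scalar, single-direction sketch. The example function is illustrative: $f(y) = 2y + |y|$ is an L-smooth bijection with $f(0) = 0$, and since it is piecewise linear and positively homogeneous, $f'(0; n) = f(n)$. The rule then says $[f^{-1}]'(0; m)$ is the unique $n$ solving $2n + |n| = m$, which has a closed form.

```python
# Inverse-function LD-derivative sketch (scalar, p = 1).
def f(y):
    return 2.0 * y + abs(y)  # L-smooth, invertible, f(0) = 0

def finv_ld0(m):
    """[f^{-1}]'(0; m): the unique n with f'(0; n) = 2n + |n| = m."""
    # For n >= 0 the equation reads 3n = m; for n < 0 it reads n = m.
    return m / 3.0 if m >= 0 else m

# Consistency: plugging the solution n back into f'(0; .) = f recovers m.
checks = [f(finv_ld0(m)) for m in (3.0, -2.0, 0.0)]
```

This is the same pattern used for implicit functions on the slide: one piecewise-linear equation system in $N$ per direction matrix $M$.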
Slide 25: Vector Forward AD Mode for LD-derivatives (example)
$f : \mathbb{R}^2 \to \mathbb{R} : (x, y) \mapsto \max\{x, y, -x, -y\}$
$g : \mathbb{R} \to \mathbb{R}^2 : x \mapsto (x, x)$
$h = f \circ g : \mathbb{R} \to \mathbb{R} : z \mapsto f(g(z)) = f(z, z) = \max\{z, z, -z, -z\} = |z|$
The nondifferentiability set of $f$ is $Z_f = \{(x, y) \in \mathbb{R}^2 : y = x \text{ or } y = -x\}$, and the range of the inner function satisfies $g(\mathbb{R}) \subseteq Z_f$: the composition never leaves the nonsmooth set, so sharp propagation rules are essential.
Slide 26: Generalized Derivatives of Algorithms: the MHEX Model (section divider; repeats the Slide 18 figure and model equations)
Slide 27: Formulating a PC$^1$ Area Constraint
Consider the set of points that are either kinks or endpoints on the composite curves:
- Index this set of points with the set $K$
- Each $k \in K$ has an associated enthalpy value $Q^k$
[Figure: $T$-$Q$ diagram of the composite curves with the enthalpy grid $Q^1, Q^2, \ldots, Q^{k-1}, Q^k, \ldots, Q^K$ marked at the kinks and endpoints.]
Slide 28: Formulating a PC$^1$ Area Constraint
If both temperatures at adjacent points are known, the interval can be treated as a two-stream heat exchanger:
$\Delta Q^k = UA^k\, \Delta T_{LM}^k$
[Figure: interval $k$ on the composite curves, with hot-side temperatures $T^{k-1}$, $T^k$ and cold-side temperatures $t^{k-1}$, $t^k$.]
Slide 29: Formulating a PC$^1$ Area Constraint
Summing the area over all intervals gives the total MHEX area:
$UA = \sum_{k \in K} \frac{\Delta Q^k}{\Delta T_{LM}^k}$
Slide 30: Formulating a PC$^1$ Area Constraint
Difficulties:
- The enthalpies and temperatures need to be sorted
- Not all of the temperatures are known
Consider a (naïve) bubble sort:
- The only calculations involved are taking the min/max of two entries
- The same sequence and number of calculations is performed regardless of whether the input is well-sorted
- Naïve bubble sort is therefore a composite PC$^1$ function
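The observation above can be sketched directly: written only in terms of min and max, bubble sort executes a fixed sequence of PC$^1$ operations for every input, so LD-derivatives can be propagated through it with the vector forward AD mode.

```python
# Naive bubble sort using only min/max comparisons-free swaps: the same
# fixed sequence of elemental operations runs for every input, making
# the sorting map a composite PC1 function.
def bubble_sort_minmax(v):
    v = list(v)
    n = len(v)
    for i in range(n - 1):
        for j in range(n - 1 - i):
            lo, hi = min(v[j], v[j + 1]), max(v[j], v[j + 1])
            v[j], v[j + 1] = lo, hi  # unconditional "swap" via min/max
    return v

sorted_v = bubble_sort_minmax([3.0, 1.0, 2.0])  # -> [1.0, 2.0, 3.0]
```

An if-else implementation of the same sort would compute the same values but, as the next slide notes for temperature calculations, would not be a composite PC$^1$ function, so its LD-derivatives could not be obtained this way.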
Slide 31: Calculating Unknown Temperatures
Finding the unknown temperatures involves solving one of these equations, $h(T^k, y, Q^k(y)) = 0$, for $T^k$ (or $t^k$):
$Q^k(y) - \sum_{i \in H} F_i \left( \max\{0, T_i^{in} - T^k\} - \max\{0, T_i^{out} - T^k\} \right) = 0$
$Q^k(y) - \sum_{j \in C} f_j \left( \max\{0, t_j^{out} - t^k\} - \max\{0, t_j^{in} - t^k\} \right) = 0$
These are easily solved using if-else logic, which correctly calculates the values of the unknown temperatures, but not their LD-derivatives, since if-else logic is not a composite PC$^1$ function. Instead, the solution of the equation defines an implicit function $\eta : \mathbb{R}^{n_y} \times \mathbb{R} \to \mathbb{R}$:
$h(T^k, y, Q^k(y)) = 0 \ \Rightarrow \ T^k(y) = \eta(y, Q^k(y))$
so the LD-derivatives $T^k{}'(\hat y; I)$ are obtained from $\eta$ and $Q^k{}'(\hat y; I)$ via the generalization of the implicit function theorem.
Slide 32: Calculating the Temperature Driving Force and Area
Once all temperatures are found, the driving force for each interval can be calculated:
$\Delta T^k = \max\{\Delta T_{min}, T^k - t^k\}, \qquad \Delta T^{k+1} = \max\{\Delta T_{min}, T^{k+1} - t^{k+1}\}$
This requires a modification of the standard log-mean temperature difference so that it is a continuously differentiable elemental function:
$\Delta T_{LM}^k = \begin{cases} \dfrac{\Delta T^{k+1} + \Delta T^k}{2} & \text{if } \Delta T^{k+1} = \Delta T^k \\[1ex] \dfrac{\Delta T^{k+1} - \Delta T^k}{\ln(\Delta T^{k+1}) - \ln(\Delta T^k)} & \text{otherwise} \end{cases}$
Finally, calculate the total area using:
$UA = \sum_{k \in K} \frac{\Delta Q^k}{\Delta T_{LM}^k}$
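A short sketch of the interval calculation above, with made-up temperatures and duty; the equal-$\Delta T$ branch replaces the log-mean expression at its removable singularity.

```python
import math

def lmtd(dT_a, dT_b):
    """Log-mean temperature difference with the equal-dT branch."""
    if dT_a == dT_b:
        return 0.5 * (dT_a + dT_b)  # limiting value as dT_a -> dT_b
    return (dT_b - dT_a) / (math.log(dT_b) - math.log(dT_a))

def driving_force(T_hot, t_cold, dT_min):
    """Clipped driving force max{dT_min, T - t} from the slide."""
    return max(dT_min, T_hot - t_cold)

# One interval with hypothetical data:
dT_k = driving_force(400.0, 350.0, 4.0)    # 50.0
dT_k1 = driving_force(390.0, 360.0, 4.0)   # 30.0
ua_k = 100.0 / lmtd(dT_k, dT_k1)           # interval contribution dQ_k / dT_LM^k
```

Summing `ua_k` over all intervals $k \in K$ gives the total-area residual used in the complete formulation.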
Slide 33: Formulating the $\Delta T_{min}$ Equation
New extended pinch operator formulation:
$EBP_p^C = \sum_{j \in C} f_j \left( \max\{0, (T_p - \Delta T_{min}) - t_j^{in}\} - \max\{0, (T_p - \Delta T_{min}) - t_j^{out}\} \right) + \max\{0, (T_p - \Delta T_{min}) - t^{max}\} - \max\{0, t^{min} - (T_p - \Delta T_{min})\}$
$EBP_p^H = \sum_{i \in H} F_i \left( \max\{0, T_p - T_i^{out}\} - \max\{0, T_p - T_i^{in}\} \right) - \max\{0, T^{min} - T_p\} + \max\{0, T_p - T^{max}\}$
Relevant equation for the new formulation:
$\min_{p \in P} (EBP_p^C - EBP_p^H) = 0$
Slide 34: Complete Formulation
$\sum_{i \in H} F_i (T_i^{in} - T_i^{out}) - \sum_{j \in C} f_j (t_j^{out} - t_j^{in}) = 0$
$\min_{p \in P} (EBP_p^C - EBP_p^H) = 0$
$UA - \sum_{k \in K} \frac{\Delta Q^k}{\Delta T_{LM}^k} = 0$
Given the inlet temperatures $T_i^{in}$, $i \in H$, and $t_j^{in}$, $j \in C$, this gives 3 equations (plus constraints on the outlet temperatures $T_i^{out}$, $t_j^{out}$). Solve for 3 unknown quantities (temperatures, flow rates, area, minimum approach temperature, etc.) using the LP-Newton method.
Slide 35: LNG Process Case Study
Two MHEX models plus three intermediate compression/expansion operations:
- 9 equations (4 nonsmooth), solved for nine variables
[Figure: process flowsheet.]
Slide 36: LNG Process Data and Variables
Seven temperatures are taken as unknown in each of the following cases:
- 2 additional variables can be taken as unknowns (UA, approach temperature, more temperatures, etc.)
Slide 37: LNG Process Example, Case I
$\Delta T_{min} = 4$ K specified for both exchangers; $UA = ?$
Given the minimum approach temperature, calculate the area.
Slide 38: LNG Process Example, Case I (results)
[Results table: $UA$ for HX-100 and HX-101 in kW/K, and the converged variables $y_1$-$y_7$ in K.]
Slide 39: Empirical Convergence Rate
[Figure: $\|f(y^k)\|$ versus iteration $k$.]
Slide 40: LNG Process Example, Case II
$UA = 120$ kW/K and $UA = 30$ kW/K specified; $\Delta T_{min} = ?$
Given the area, calculate the minimum approach temperature.
Slide 41: LNG Process Example, Case II (results)
HX-100: $\Delta T_{min} = 2.62$ K; HX-101: $\Delta T_{min} = 1.26$ K
[Results table: converged variables $y_1$-$y_7$ in K.]
Slide 42: LNG Process Example, Case III
$\Delta T_{min} = 4$ K, with $UA = 85$ kW/K and $UA = 35$ kW/K specified.
Given the area and the minimum approach temperature, calculate (more) information about the streams.
Slide 43: LNG Process Example, Case III (results)
[Results table: outlet temperatures $T_H^{out}$ and $t_{C3}^{out}$ in K, and the converged variables $y_1$-$y_7$ in K.]
Slide 44: PC$^1$ RHS with a Non-PC$^1$ Solution
$\dot x = y, \quad \dot y = -x, \quad \dot z = |x|, \qquad x(0) = x_0,\ y(0) = y_0,\ z(0) = z_0$
With $(x_0, y_0) = (r \cos\theta, r \sin\theta)$:
$z(t) = z_0 + \int_0^t |x_0 \cos(s) + y_0 \sin(s)|\, ds = z_0 + \int_0^t |(r\cos\theta)\cos(s) + (r\sin\theta)\sin(s)|\, ds = z_0 + \sqrt{x_0^2 + y_0^2} \int_0^t |\cos(\theta - s)|\, ds$
In particular, $z_{2\pi k}(x_0, y_0, 0) = 4k \sqrt{x_0^2 + y_0^2}$: a positive multiple of the 2-norm of $(x_0, y_0)$, independent of $\theta$. Since the 2-norm is not PC$^1$, the ODE solution is not PC$^1$ in its initial conditions even though the right-hand side is.
[Figure: $z_t(x_0)$ for $t \in [0, 10]$, with $y_0 = 0$, $z_0 = 0$.]
Slides 45-46: Existing Dynamic Sensitivities
[Figures: trajectories $x(t, c)$ and the linear Newton approximation $\Gamma[x(2, \cdot)](t)$.]
Slide 47: Dynamic LD-derivatives
[Figure-only slide.]
Slide 48: Example 1
[Figure: trajectory bounds on $x(t, c)$ versus time $t$; legend: generalized derivative bounds, LNA bounds, lexicographic derivative bounds.]
Slide 49: Singleton & Nonsingleton Trajectories
[Figure: three trajectories relative to the nonsmoothness surface, annotated "OK", "Not OK", "OK".]
More informationPh.D. Katarína Bellová Page 1 Mathematics 2 (10-PHY-BIPMA2) EXAM - Solutions, 20 July 2017, 10:00 12:00 All answers to be justified.
PhD Katarína Bellová Page 1 Mathematics 2 (10-PHY-BIPMA2 EXAM - Solutions, 20 July 2017, 10:00 12:00 All answers to be justified Problem 1 [ points]: For which parameters λ R does the following system
More informationLectures 9-10: Polynomial and piecewise polynomial interpolation
Lectures 9-1: Polynomial and piecewise polynomial interpolation Let f be a function, which is only known at the nodes x 1, x,, x n, ie, all we know about the function f are its values y j = f(x j ), j
More informationLecture 11: Arclength and Line Integrals
Lecture 11: Arclength and Line Integrals Rafikul Alam Department of Mathematics IIT Guwahati Parametric curves Definition: A continuous mapping γ : [a, b] R n is called a parametric curve or a parametrized
More informationUnconstrained optimization
Chapter 4 Unconstrained optimization An unconstrained optimization problem takes the form min x Rnf(x) (4.1) for a target functional (also called objective function) f : R n R. In this chapter and throughout
More informationNonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems. p. 1/1
Nonlinear Systems and Control Lecture # 12 Converse Lyapunov Functions & Time Varying Systems p. 1/1 p. 2/1 Converse Lyapunov Theorem Exponential Stability Let x = 0 be an exponentially stable equilibrium
More informationMATH 411 NOTES (UNDER CONSTRUCTION)
MATH 411 NOTES (NDE CONSTCTION 1. Notes on compact sets. This is similar to ideas you learned in Math 410, except open sets had not yet been defined. Definition 1.1. K n is compact if for every covering
More informationErrata for Vector and Geometric Calculus Printings 1-4
October 21, 2017 Errata for Vector and Geometric Calculus Printings 1-4 Note: p. m (n) refers to page m of Printing 4 and page n of Printings 1-3. p. 31 (29), just before Theorem 3.10. f x(h) = [f x][h]
More informationMath 212-Lecture 8. The chain rule with one independent variable
Math 212-Lecture 8 137: The multivariable chain rule The chain rule with one independent variable w = f(x, y) If the particle is moving along a curve x = x(t), y = y(t), then the values that the particle
More informationHalf of Final Exam Name: Practice Problems October 28, 2014
Math 54. Treibergs Half of Final Exam Name: Practice Problems October 28, 24 Half of the final will be over material since the last midterm exam, such as the practice problems given here. The other half
More informationPartial Derivatives. w = f(x, y, z).
Partial Derivatives 1 Functions of Several Variables So far we have focused our attention of functions of one variable. These functions model situations in which a variable depends on another independent
More informationChapter 3. Differentiable Mappings. 1. Differentiable Mappings
Chapter 3 Differentiable Mappings 1 Differentiable Mappings Let V and W be two linear spaces over IR A mapping L from V to W is called a linear mapping if L(u + v) = Lu + Lv for all u, v V and L(λv) =
More informationDerivatives and Integrals
Derivatives and Integrals Definition 1: Derivative Formulas d dx (c) = 0 d dx (f ± g) = f ± g d dx (kx) = k d dx (xn ) = nx n 1 (f g) = f g + fg ( ) f = f g fg g g 2 (f(g(x))) = f (g(x)) g (x) d dx (ax
More informationMath 234 Final Exam (with answers) Spring 2017
Math 234 Final Exam (with answers) pring 217 1. onsider the points A = (1, 2, 3), B = (1, 2, 2), and = (2, 1, 4). (a) [6 points] Find the area of the triangle formed by A, B, and. olution: One way to solve
More informationEE 546, Univ of Washington, Spring Proximal mapping. introduction. review of conjugate functions. proximal mapping. Proximal mapping 6 1
EE 546, Univ of Washington, Spring 2012 6. Proximal mapping introduction review of conjugate functions proximal mapping Proximal mapping 6 1 Proximal mapping the proximal mapping (prox-operator) of a convex
More informationDuality and dynamics in Hamilton-Jacobi theory for fully convex problems of control
Duality and dynamics in Hamilton-Jacobi theory for fully convex problems of control RTyrrell Rockafellar and Peter R Wolenski Abstract This paper describes some recent results in Hamilton- Jacobi theory
More informationNonlinear equations. Norms for R n. Convergence orders for iterative methods
Nonlinear equations Norms for R n Assume that X is a vector space. A norm is a mapping X R with x such that for all x, y X, α R x = = x = αx = α x x + y x + y We define the following norms on the vector
More informationLecture 14: Newton s Method
10-725/36-725: Conve Optimization Fall 2016 Lecturer: Javier Pena Lecture 14: Newton s ethod Scribes: Varun Joshi, Xuan Li Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes
More information10. Unconstrained minimization
Convex Optimization Boyd & Vandenberghe 10. Unconstrained minimization terminology and assumptions gradient descent method steepest descent method Newton s method self-concordant functions implementation
More informationChain Rule. MATH 311, Calculus III. J. Robert Buchanan. Spring Department of Mathematics
3.33pt Chain Rule MATH 311, Calculus III J. Robert Buchanan Department of Mathematics Spring 2019 Single Variable Chain Rule Suppose y = g(x) and z = f (y) then dz dx = d (f (g(x))) dx = f (g(x))g (x)
More informationSubgradients. subgradients. strong and weak subgradient calculus. optimality conditions via subgradients. directional derivatives
Subgradients subgradients strong and weak subgradient calculus optimality conditions via subgradients directional derivatives Prof. S. Boyd, EE364b, Stanford University Basic inequality recall basic inequality
More informationMA102: Multivariable Calculus
MA102: Multivariable Calculus Rupam Barman and Shreemayee Bora Department of Mathematics IIT Guwahati Differentiability of f : U R n R m Definition: Let U R n be open. Then f : U R n R m is differentiable
More informationChapter 4. Inverse Function Theorem. 4.1 The Inverse Function Theorem
Chapter 4 Inverse Function Theorem d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d d dd d d d d This chapter
More informationMarch 8, 2010 MATH 408 FINAL EXAM SAMPLE
March 8, 200 MATH 408 FINAL EXAM SAMPLE EXAM OUTLINE The final exam for this course takes place in the regular course classroom (MEB 238) on Monday, March 2, 8:30-0:20 am. You may bring two-sided 8 page
More informationLecture II: Vector and Multivariate Calculus
Lecture II: Vector and Multivariate Calculus Dot Product a, b R ' ', a ( b = +,- a + ( b + R. a ( b = a b cos θ. θ convex angle between the vectors. Squared norm of vector: a 3 = a ( a. Alternative notation:
More informationSIAM Conference on Imaging Science, Bologna, Italy, Adaptive FISTA. Peter Ochs Saarland University
SIAM Conference on Imaging Science, Bologna, Italy, 2018 Adaptive FISTA Peter Ochs Saarland University 07.06.2018 joint work with Thomas Pock, TU Graz, Austria c 2018 Peter Ochs Adaptive FISTA 1 / 16 Some
More informationNonnegative Inverse Eigenvalue Problems with Partial Eigendata
Nonnegative Inverse Eigenvalue Problems with Partial Eigendata Zheng-Jian Bai Stefano Serra-Capizzano Zhi Zhao June 25, 2011 Abstract In this paper we consider the inverse problem of constructing an n-by-n
More informationLMI Methods in Optimal and Robust Control
LMI Methods in Optimal and Robust Control Matthew M. Peet Arizona State University Lecture 15: Nonlinear Systems and Lyapunov Functions Overview Our next goal is to extend LMI s and optimization to nonlinear
More informationIntroduction. New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems
New Nonsmooth Trust Region Method for Unconstraint Locally Lipschitz Optimization Problems Z. Akbari 1, R. Yousefpour 2, M. R. Peyghami 3 1 Department of Mathematics, K.N. Toosi University of Technology,
More informationLocal strong convexity and local Lipschitz continuity of the gradient of convex functions
Local strong convexity and local Lipschitz continuity of the gradient of convex functions R. Goebel and R.T. Rockafellar May 23, 2007 Abstract. Given a pair of convex conjugate functions f and f, we investigate
More informationA Smoothing Newton Method for Solving Absolute Value Equations
A Smoothing Newton Method for Solving Absolute Value Equations Xiaoqin Jiang Department of public basic, Wuhan Yangtze Business University, Wuhan 430065, P.R. China 392875220@qq.com Abstract: In this paper,
More informationFrank-Wolfe Method. Ryan Tibshirani Convex Optimization
Frank-Wolfe Method Ryan Tibshirani Convex Optimization 10-725 Last time: ADMM For the problem min x,z f(x) + g(z) subject to Ax + Bz = c we form augmented Lagrangian (scaled form): L ρ (x, z, w) = f(x)
More informationKey words. saddle-point dynamics, asymptotic convergence, convex-concave functions, proximal calculus, center manifold theory, nonsmooth dynamics
SADDLE-POINT DYNAMICS: CONDITIONS FOR ASYMPTOTIC STABILITY OF SADDLE POINTS ASHISH CHERUKURI, BAHMAN GHARESIFARD, AND JORGE CORTÉS Abstract. This paper considers continuously differentiable functions of
More information1 The Observability Canonical Form
NONLINEAR OBSERVERS AND SEPARATION PRINCIPLE 1 The Observability Canonical Form In this Chapter we discuss the design of observers for nonlinear systems modelled by equations of the form ẋ = f(x, u) (1)
More informationComputing Neural Network Gradients
Computing Neural Network Gradients Kevin Clark 1 Introduction The purpose of these notes is to demonstrate how to quickly compute neural network gradients in a completely vectorized way. It is complementary
More information