
Structural and Multidisciplinary Optimization P. Duysinx and P. Tossings 2018-2019

CONTACTS
Pierre Duysinx
Institut de Mécanique et du Génie Civil (B52/3)
Phone number: 04/366.91.94
Email: P.Duysinx@uliege.be

Patricia Tossings
Institut de Mathématique (B37), 0/57
Phone number: 04/366.93.73
Email: Patricia.Tossings@uliege.be

TABLE OF CONTENTS
01 - Optimization in Engineering Design
02 - Fundamentals of Structural Optimization
03 - Introduction to Mathematical Programming
04 - Algorithms for Unconstrained Optimization: Gradient Methods (including conjugate directions)
05 - Line Search Techniques
06 - Algorithms for Unconstrained Optimization: Newton and Quasi-Newton Methods
07 - Quasi-Unconstrained Optimization
08 - General Constrained Optimization: Dual Methods
09 - General Constrained Optimization: Transformation Methods (including SLP and SQP)
10 - Optimality Criteria
11 - Structural Approximations
12 - CONLIN and MMA
13 - Sensitivity Analysis for Finite Element Models
14 - Introduction to Shape Optimization
15 - Introduction to Topology Optimization

Chapter 03 INTRODUCTION TO MATHEMATICAL PROGRAMMING THEORY 1

Contents of chapter 03
Motivation and standard formulation of MPT 3
Feasible point, feasible domain 6
Global (strict) optimum 8
Local optimum 9
Looking for optimality conditions 10
Optimality conditions for unconstrained problems 14
Optimality conditions for problems with one equality constraint 17
Optimality conditions for problems with m equality constraints 20
Optimality conditions for problems with m inequality constraints 21
In practice 27
Linear sets 28
Linear functions 29
Convex sets and functions 30
MPT: partial classification 34
In practice - Taylor expansion 37
A few words about the topology of $\mathbb{R}^n$ 38
In practice - Efficiency of an algorithm 41
Complements about the topology of $\mathbb{R}^n$ 46
Important results with regard to optimization (Weierstrass) 50
2

MOTIVATION
Concrete problem → Modelling → Mathematical Optimization Problem (P)

Find $x$, solution of problem (P):
Minimize $f(x)$
subject to $g_j(x) \le 0 \quad (j = 1,\dots,m)$, $x \in S \subseteq \mathbb{R}^n$

Mathematical Programming Theory (MPT)
Convention. In this course, all the functions are assumed to be real valued. 3

Vocabulary and notations
In problem (P)...
f is the objective or cost function.
The conditions $g_j(x) \le 0$ $(j = 1,\dots,m)$ and $x \in S \subseteq \mathbb{R}^n$ are the constraints.
The set S is often an interval of $\mathbb{R}^n$. In that case, one speaks about side constraints.
$x \in \mathbb{R}^n$: $x = (x_1,\dots,x_n)^T$, $x_i \in \mathbb{R}$ $(i = 1,\dots,n)$. 4

The formulation of problem (P) is not restrictive.
With regard to the objective function:
Maximize $f(x)$ ⟺ Minimize $[-f(x)]$: same optimal points, opposite optimal values.
An interesting alternative (valid when $f(x) > 0$): Maximize $f(x)$ ⟺ Minimize $1/f(x)$.
With regard to the constraints:
$g_j(x) \ge 0 \iff -g_j(x) \le 0$;
$g_j(x) = 0 \iff g_j(x) \le 0$ and $-g_j(x) \le 0$. 5

Feasible point, feasible domain
$x \in \mathbb{R}^n$ is a solution or feasible point of problem (P) iff $g_j(x) \le 0$ $(j = 1,\dots,m)$ and $x \in S$.
The feasible domain of problem (P) is the set of all its feasible points. 6

Interior, boundary and exterior point
Assume that $S = \mathbb{R}^n$.
$x$ is an interior point of the feasible domain of problem (P) iff $g_j(x) < 0$, $\forall j \in \{1,\dots,m\}$.
$x$ is a boundary point of this domain iff $x$ is feasible and $\exists j \in \{1,\dots,m\} : g_j(x) = 0$.
$x$ is an exterior point of this domain iff $\exists j \in \{1,\dots,m\} : g_j(x) > 0$.
Note. The points IP, BP and EP on the previous figure illustrate these notions with S different from the whole space.

Inactive, active and violated constraint
The constraint $g_j$ is inactive at the point $x$ iff $g_j(x) < 0$. It is active at $x$ iff $g_j(x) = 0$. It is violated at $x$ iff $g_j(x) > 0$. 7

Global (strict) optimum
$x^*$ is an optimal solution or a global optimum (global minimizer) of problem (P) iff
- $x^*$ is feasible,
- $f(x^*) \le f(y)$ for any feasible point $y$ of (P).
$x^*$ is said to be strict (or strong) if the strict inequality holds for every feasible $y \ne x^*$. 8

Local optimum
$x^*$ is a local optimum of problem (P) iff it admits a neighbourhood $V(x^*)$ such that $x^*$ is a global optimum of the local problem
(P_loc) Minimize $f(x)$ subject to $g_j(x) \le 0$ $(j = 1,\dots,m)$, $x \in S \cap V(x^*)$. 9

Looking for optimality conditions - A first example in $\mathbb{R}^2$
$f(x_1,x_2) = x_1^2 + x_2^2$
$f(0,0) = 0$ and $f(x_1,x_2) > 0$ for all $(x_1,x_2)^T \ne (0,0)^T$
⟹ f admits a global strict optimum (minimum) at $(0,0)^T$.
Extract from the notes of Pr. E. Delhez, Analyse mathématique 10

An interesting comment
$f \in C^2(\mathbb{R}^2)$ and
$\frac{\partial f}{\partial x_1}(x_1,x_2) = 2x_1$, $\frac{\partial^2 f}{\partial x_1^2}(x_1,x_2) = 2$,
$\frac{\partial f}{\partial x_2}(x_1,x_2) = 2x_2$, $\frac{\partial^2 f}{\partial x_2^2}(x_1,x_2) = 2$,
$\frac{\partial^2 f}{\partial x_1 \partial x_2}(x_1,x_2) = \frac{\partial^2 f}{\partial x_2 \partial x_1}(x_1,x_2) = 0$.
$\nabla f(x_1,x_2) = 0$ iff $x_1 = x_2 = 0$.
$\nabla^2 f(x_1,x_2) = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$ is positive definite (see slide 15).
Notations. $\nabla f$ denotes the gradient of f and $\nabla^2 f$ denotes its Hessian. 11

Looking for optimality conditions - A second example in $\mathbb{R}^2$
$f(x_1,x_2) = x_2^2 - x_1^2$
$f(x_1,0) = -x_1^2 \le 0$, $\forall x_1 \in \mathbb{R}$, $f(0,0) = 0$ and $f(0,x_2) = x_2^2 \ge 0$, $\forall x_2 \in \mathbb{R}$.
At the origin, f admits a maximum with regard to $x_1$ and a minimum with regard to $x_2$.
One says that f admits a saddle point at the origin.
Extract from the notes of Pr. E. Delhez, Analyse mathématique 12

An interesting comment
$f \in C^2(\mathbb{R}^2)$ and
$\frac{\partial f}{\partial x_1}(x_1,x_2) = -2x_1$, $\frac{\partial^2 f}{\partial x_1^2}(x_1,x_2) = -2$,
$\frac{\partial f}{\partial x_2}(x_1,x_2) = 2x_2$, $\frac{\partial^2 f}{\partial x_2^2}(x_1,x_2) = 2$,
$\frac{\partial^2 f}{\partial x_1 \partial x_2}(x_1,x_2) = \frac{\partial^2 f}{\partial x_2 \partial x_1}(x_1,x_2) = 0$.
$\nabla f(x_1,x_2) = 0$ iff $x_1 = x_2 = 0$.
$\nabla^2 f(x_1,x_2) = \begin{pmatrix} -2 & 0 \\ 0 & 2 \end{pmatrix}$ is not definite (see slide 15). 13

Optimality conditions - (1)
UNCONSTRAINED problems
(P_unc) Minimize $f(x)$, $x \in \mathbb{R}^n$

NECESSARY optimality conditions
Let $x^*$ be a local optimum of problem (P_unc).
If f is continuously differentiable in a neighbourhood of $x^*$ [f is $C^1$ at $x^*$], then $x^*$ is a stationary point of f: $\nabla f(x^*) = 0$.
If f is twice continuously differentiable in a neighbourhood of $x^*$ [f is $C^2$ at $x^*$], then $\nabla^2 f(x^*)$ is positive semi-definite. 14

Reminder of Algebra
Definition
A matrix $A \in \mathbb{R}^{n \times n}$ is positive semi-definite (respectively negative semi-definite) iff $x^T A x \ge 0$ (resp. $x^T A x \le 0$), $\forall x \in \mathbb{R}^n$.
It is positive definite (respectively negative definite) if, moreover, $x^T A x = 0 \Rightarrow x = 0$.
A matrix $A \in \mathbb{R}^{n \times n}$ is not definite if it is neither positive semi-definite nor negative semi-definite.

Proposition
A symmetric matrix $A \in \mathbb{R}^{n \times n}$ is
- positive semi-definite iff all its eigenvalues are nonnegative ($\ge 0$),
- positive definite iff all its eigenvalues are strictly positive ($> 0$),
- negative semi-definite iff all its eigenvalues are nonpositive ($\le 0$),
- negative definite iff all its eigenvalues are strictly negative ($< 0$),
- not definite if it admits both strictly positive and strictly negative eigenvalues.
(See also the Sylvester criterion.) 15
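
A quick numerical illustration (not from the slides): NumPy's `eigvalsh` returns the eigenvalues of a symmetric matrix, so the proposition above translates directly into a small classifier. Applied to the Hessians of the two examples on slides 11 and 13, it distinguishes the minimum from the saddle point:

```python
import numpy as np

def classify(A, tol=1e-10):
    """Classify a symmetric matrix by the signs of its eigenvalues."""
    eig = np.linalg.eigvalsh(A)  # eigenvalues, sorted ascending
    if np.all(eig > tol):
        return "positive definite"
    if np.all(eig >= -tol):
        return "positive semi-definite"
    if np.all(eig < -tol):
        return "negative definite"
    if np.all(eig <= tol):
        return "negative semi-definite"
    return "not definite"

# Hessians of the two examples above
print(classify(np.array([[2.0, 0.0], [0.0, 2.0]])))   # positive definite -> strict minimum
print(classify(np.array([[-2.0, 0.0], [0.0, 2.0]])))  # not definite -> saddle point
```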

Optimality conditions - (2)
UNCONSTRAINED problems
(P_unc) Minimize $f(x)$, $x \in \mathbb{R}^n$

SUFFICIENT optimality conditions
Assume that
- f is $C^2$ at $x^*$,
- $\nabla f(x^*) = 0$,
- $\nabla^2 f(x^*)$ is positive definite.
Then $x^*$ is a strict local optimum of (P_unc).
Relaxation. If, in addition to the first two assumptions, $\nabla^2 f$ is positive semi-definite in a neighbourhood of $x^*$, then $x^*$ is a local optimum of (P_unc).
Characterization of a global optimum?... Except if f is convex (see slides 30-32). 16

Optimality conditions - (3)
Problems with ONE EQUALITY CONSTRAINT
(P_eqc) Minimize $f(x)$ subject to $g(x) = 0$

NECESSARY optimality conditions
Assume that
- f and g are $C^1$ at $x^*$,
- $\nabla g(x^*) \ne 0$,
- $x^*$ is an optimal solution of (P_eqc).
There is a real number $\lambda^*$ such that $(x^*,\lambda^*)$ is a stationary point of the Lagrangian function
$L(x,\lambda) = f(x) + \lambda g(x)$, $x \in \mathbb{R}^n$, $\lambda \in \mathbb{R}$.
Vocabulary. $\lambda^*$ is called a Lagrange multiplier. 17
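
For a sketch of how this is used in computation, consider a hypothetical example (not from the slides): minimize $f(x) = x_1^2 + x_2^2$ subject to $g(x) = x_1 + x_2 - 1 = 0$. Stationarity of the Lagrangian yields a linear system in $(x_1, x_2, \lambda)$ that NumPy solves directly:

```python
import numpy as np

# Hypothetical example: minimize x1^2 + x2^2  s.t.  x1 + x2 - 1 = 0.
# Stationarity of L(x, lam) = f(x) + lam * g(x) reads
#   dL/dx1 = 2*x1 + lam = 0
#   dL/dx2 = 2*x2 + lam = 0
#   g(x)   = x1 + x2 - 1 = 0
A = np.array([[2.0, 0.0, 1.0],
              [0.0, 2.0, 1.0],
              [1.0, 1.0, 0.0]])
b = np.array([0.0, 0.0, 1.0])
x1, x2, lam = np.linalg.solve(A, b)
print(x1, x2, lam)   # 0.5 0.5 -1.0: the closest point of the line to the origin
```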

An intuitive explanation of the previous result
$(x^*,\lambda^*)$ is a stationary point of the Lagrangian function iff
$\nabla f(x^*) + \lambda^* \nabla g(x^*) = 0$ and $g(x^*) = 0$.
The second condition means that $x^*$ satisfies the constraint. The first one implies that $\nabla f(x^*)$ and $\nabla g(x^*)$ are parallel. If this condition is not satisfied, there exists a direction $e$ tangent to the constraint (i.e. with $e^T \nabla g(x^*) = 0$) such that
$D_e f(x^*) = e^T \nabla f(x^*) > 0$ while $D_{-e} f(x^*) = -e^T \nabla f(x^*) < 0$,
and f can not be extremal at $x^*$.
Extract from the notes of Pr. E. Delhez, Analyse mathématique 18

Reminder of Mathematical Analysis
Let $e \in \mathbb{R}^n$ be a direction: $e = (e_1,\dots,e_n)^T$ satisfies $\|e\| = \sqrt{\sum_{i=1}^n e_i^2} = 1$ (norm: see slide 38).

Definition
The directional derivative of f in the direction e at the point $x \in \mathbb{R}^n$, denoted $D_e f(x)$, is the number defined by
$D_e f(x) = \lim_{\theta \to 0^+} \frac{f(x + \theta e) - f(x)}{\theta}$
if the limit in the right hand side exists and is finite.

Proposition. If $D_e f(x)$ exists, then, starting from x, the function f increases (respectively decreases) in the direction of e iff $D_e f(x) > 0$ (resp. $D_e f(x) < 0$).
Proposition. If f is $C^1$ at x, then $D_e f(x) = e^T \nabla f(x)$ (inner product: see slide 38).
Proposition. If f is $C^1$ at x, then, starting from x,
- the greatest increase of f is obtained in the direction of $\nabla f(x)$,
- its greatest decrease is obtained in the direction of $-\nabla f(x)$,
- the variation of f is null in any direction e orthogonal to $\nabla f(x)$ [$e^T \nabla f(x) = 0$].
The last result will be useful in Chapter 04. 19
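
The limit definition also gives a practical finite-difference check of the identity $D_e f(x) = e^T \nabla f(x)$; a minimal sketch (function and point chosen arbitrarily for illustration):

```python
import numpy as np

def directional_derivative(f, x, e, theta=1e-6):
    """One-sided finite-difference estimate of D_e f(x)."""
    return (f(x + theta * e) - f(x)) / theta

f = lambda x: x[0]**2 + x[1]**2
grad_f = lambda x: np.array([2 * x[0], 2 * x[1]])

x = np.array([1.0, 2.0])
e = np.array([1.0, 0.0])                   # unit direction
print(directional_derivative(f, x, e))     # ~ 2.0 (finite-difference estimate)
print(e @ grad_f(x))                       # 2.0 exactly: e^T grad f(x)
```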

Optimality conditions - (4)
Problems with m EQUALITY CONSTRAINTS
(P_eqc) Minimize $f(x)$ subject to $g_j(x) = 0$ $(j = 1,\dots,m)$

NECESSARY optimality conditions
Assume that
- f and $g_j$ $(j = 1,\dots,m)$ are $C^1$ at $x^*$,
- the gradients of the constraints are linearly independent at $x^*$,
- $x^*$ is an optimal solution of (P_eqc).
There are real numbers $\lambda_1^*,\dots,\lambda_m^*$ such that $(x^*,\lambda_1^*,\dots,\lambda_m^*)$ is a stationary point of the Lagrangian function
$L(x,\lambda_1,\dots,\lambda_m) = f(x) + \sum_{j=1}^m \lambda_j g_j(x)$, $x \in \mathbb{R}^n$, $\lambda_j \in \mathbb{R}$.
Convention. When there is no risk of confusion, we set $\lambda = (\lambda_1,\dots,\lambda_m)$. 20

Optimality conditions - (5)
Problems with m INEQUALITY CONSTRAINTS
(P_ineqc) Minimize $f(x)$ subject to $g_j(x) \le 0$ $(j = 1,\dots,m)$

We associate to (P_ineqc) the Lagrangian function
$L(x,\lambda) = f(x) + \sum_{j=1}^m \lambda_j g_j(x)$, $x \in \mathbb{R}^n$, $\lambda \ge 0$ (attention!)
and the quasi-unconstrained problem (see slide 34)
(P_Lag) $\min_x \max_{\lambda \ge 0} L(x,\lambda)$.
Convention. $\lambda \ge 0$ is written for $\lambda_j \ge 0$, $j = 1,\dots,m$.
Remark. (P_ineqc) ⟺ (P_Lag) (see chapter 08).
Interpretation. With regard to problem (P_Lag), the second term of the Lagrangian function can be seen as a penalization for unfeasible points (see chapter 09). 21

NECESSARY optimality conditions for (P_ineqc): Karush-Kuhn-Tucker (in brief: KKT)
Assume that
- f and $g_j$ $(j = 1,\dots,m)$ are $C^1$ at $x^*$,
- $x^*$ is a regular point, i.e. the gradients of the active constraints at $x^*$ are linearly independent,
- $x^*$ is an optimal solution of (P_ineqc).
There is a set of Lagrange multipliers $\lambda_1^*,\dots,\lambda_m^*$ such that
$\nabla f(x^*) + \sum_{j=1}^m \lambda_j^* \nabla g_j(x^*) = 0$
and, for $j = 1,\dots,m$,
$g_j(x^*) \le 0$, $\lambda_j^* \ge 0$, $\lambda_j^* g_j(x^*) = 0$.
Remark. The last condition implies that the Lagrange multipliers corresponding to inactive constraints are zero (complementary slackness). 22

KKT - ILLUSTRATION / DISCUSSION
Objective function: $f(x_1,x_2) = \frac{x_1^2 + x_2^2}{2}$
Constraints:
$g_1(x_1,x_2) = 1 - x_1 \le 0$
$g_2(x_1,x_2) = x_1^2 - 4x_1 + x_2 \le 0$
The optimal value of f over the whole space is 0 and is achieved for $x_1 = x_2 = 0$. The isovalue or level curves of f are circles centered at the origin. On the circle with radius r, the value of f is $r^2/2$. 23

The optimal value of f over the feasible domain is achieved at $x^* = (1,0)^T$ [$f(x^*) = \frac{1}{2}$].
$g_1(x^*) = 0$: active constraint,
$g_2(x^*) = -3 < 0$: inactive constraint.
$\nabla g_1(x) \ne 0$, $\forall x \in \mathbb{R}^2$. In particular, $x^*$ is a regular point. 24

Lagrangian function
$L(x,\lambda_1,\lambda_2) = \frac{x_1^2 + x_2^2}{2} + \lambda_1 (1 - x_1) + \lambda_2 (x_1^2 - 4x_1 + x_2)$
$\frac{\partial L}{\partial x_1}(x,\lambda_1,\lambda_2) = x_1 - \lambda_1 + (2x_1 - 4)\lambda_2$
$\frac{\partial L}{\partial x_2}(x,\lambda_1,\lambda_2) = x_2 + \lambda_2$
$\nabla_x L(x^*,\lambda_1,\lambda_2) = 0 \iff \{1 - \lambda_1 - 2\lambda_2 = 0,\ \lambda_2 = 0\} \iff \{\lambda_1 = 1,\ \lambda_2 = 0\}$
$\lambda_1 \ge 0$ and $\lambda_2 \ge 0$: the positivity of the Lagrange multipliers is satisfied.
$\lambda_1 g_1(x^*) = 0$ and $\lambda_2 g_2(x^*) = 0$: the complementary slackness is also satisfied. 25

Let us now consider the point $\bar{x} = (2,0)^T$.
$\bar{x}$ is a feasible point. $g_1$ and $g_2$ are both inactive at $\bar{x}$.
$\nabla_x L(\bar{x},\lambda_1,\lambda_2) = 0 \iff \{2 - \lambda_1 = 0,\ \lambda_2 = 0\} \iff \{\lambda_1 = 2,\ \lambda_2 = 0\}$
$\lambda_1 \ge 0$ and $\lambda_2 \ge 0$, $\lambda_2 g_2(\bar{x}) = 0$ BUT $\lambda_1 g_1(\bar{x}) \ne 0$.
The positivity of the Lagrange multipliers is satisfied BUT the complementary slackness is not.
Due to the necessary optimality conditions of KKT, $\bar{x}$ can not be an optimal point of f over the feasible domain.
$\bar{x}$ is effectively not such a point since $f(\bar{x}) = 2 > \frac{1}{2} = f(x^*)$. 26
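
The whole discussion can be replayed mechanically: the sketch below (an illustration, not course material) evaluates the four KKT requirements at both points and shows that only complementary slackness fails at $\bar{x}$:

```python
import numpy as np

# f(x) = (x1^2 + x2^2)/2, g1(x) = 1 - x1 <= 0, g2(x) = x1^2 - 4*x1 + x2 <= 0
grad_f  = lambda x: np.array([x[0], x[1]])
g       = lambda x: np.array([1 - x[0], x[0]**2 - 4 * x[0] + x[1]])
grad_g1 = lambda x: np.array([-1.0, 0.0])
grad_g2 = lambda x: np.array([2 * x[0] - 4, 1.0])

def kkt_report(x, lam, tol=1e-9):
    stat = grad_f(x) + lam[0] * grad_g1(x) + lam[1] * grad_g2(x)
    return {
        "stationarity": np.allclose(stat, 0.0, atol=tol),
        "feasibility":  bool(np.all(g(x) <= tol)),
        "positivity":   bool(np.all(lam >= -tol)),
        "slackness":    np.allclose(lam * g(x), 0.0, atol=tol),
    }

print(kkt_report(np.array([1.0, 0.0]), np.array([1.0, 0.0])))  # all True
print(kkt_report(np.array([2.0, 0.0]), np.array([2.0, 0.0])))  # slackness: False
```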

IN PRACTICE
- f is given but its values may be hard to compute.
- f is not necessarily differentiable and, even when it is, $\nabla f$ and $\nabla^2 f$ may be difficult to compute or approximate.
Most of the algorithms introduced to solve problem (P) are iterative.
Challenge: to obtain globally convergent methods (see slide 41).
Response: adaptation to some classes of problems (see slides 34-36).
An interesting approach: global phase + local phase.
Measures of efficiency:
- number of function (and derivative) evaluations required,
- number of arithmetic operations required,
- storage requirements,
- order (or rate) of convergence (see slides 42-44). 27

LINEAR SETS
A set $L \subseteq \mathbb{R}^n$ is linear iff $\forall x, y \in L$, $\forall \lambda \in \mathbb{R}$: $(x + y) \in L$ and $(\lambda x) \in L$.
Generalization: $x^k \in L$, $\lambda_k \in \mathbb{R}$ $(k = 1,\dots,K)$ $\Rightarrow$ $\sum_{k=1}^K \lambda_k x^k \in L$ (linear combination of the $x^k$).
An affine set results from the translation of a linear one:
$A = a + L$, $a \in \mathbb{R}^n$, L linear set of $\mathbb{R}^n$. 28

LINEAR FUNCTIONS
Let L be a linear set of $\mathbb{R}^n$. A function $f : L \to \mathbb{R}$ is linear iff
$f\left(\sum_{k=1}^K \lambda_k x^k\right) = \sum_{k=1}^K \lambda_k f(x^k)$
for any linear combination of elements in L.
Consequence: $f(x) = (a \mid x) = a^T x = \sum_{i=1}^n a_i x_i$, $a \in \mathbb{R}^n$.
An affine function results from the addition of a linear function and a constant:
$f(x) = (a \mid x) + b$, $a \in \mathbb{R}^n$, $b \in \mathbb{R}$. 29

CONVEX SETS AND FUNCTIONS
A set $C \subseteq \mathbb{R}^n$ is convex iff $\forall x, y \in C$, $\forall \vartheta \in [0,1]$: $[\vartheta x + (1 - \vartheta) y] \in C$.
Generalization: $x^k \in C$, $\vartheta_k \in [0,1]$ $(k = 1,\dots,K)$, $\sum_{k=1}^K \vartheta_k = 1$ $\Rightarrow$ $\sum_{k=1}^K \vartheta_k x^k \in C$ (convex combination of the $x^k$).
Let C be a convex set of $\mathbb{R}^n$. A function $f : C \to \mathbb{R}$ is convex iff
$f[\vartheta x + (1 - \vartheta) y] \le \vartheta f(x) + (1 - \vartheta) f(y)$, $\forall x, y \in C$, $\forall \vartheta \in [0,1]$.
It is concave if the inequality is reversed.
f is strictly convex (respectively strictly concave) if the strict (appropriate) inequality holds whenever $x \ne y$ and $\vartheta \in\, ]0,1[$.
A function f defined on a convex set C is (strictly) concave iff its opposite is (strictly) convex. 30
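
The defining inequality is easy to spot-check numerically on random segments; a minimal sketch (example function chosen arbitrarily):

```python
import numpy as np

# Spot-check: f is convex iff f(t*x + (1-t)*y) <= t*f(x) + (1-t)*f(y)
# for all x, y and t in [0, 1]. Here f(x) = ||x||^2, which is convex.
rng = np.random.default_rng(0)
f = lambda x: np.sum(x**2)
for _ in range(1000):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    t = rng.uniform()
    assert f(t * x + (1 - t) * y) <= t * f(x) + (1 - t) * f(y) + 1e-12
print("convexity inequality holds on all sampled triples")
```

Note that such sampling can only refute convexity, never prove it; the analytical criteria of the next slides do prove it.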

Some important properties of convex functions - (1)
A strictly convex function $f : \mathbb{R}^n \to \mathbb{R}$ admits at most one minimizer.
A $C^1$ function $f : \mathbb{R}^n \to \mathbb{R}$ is convex iff
$f(y) \ge f(x) + [\nabla f(x)]^T (y - x)$, $\forall x, y \in \mathbb{R}^n$.
It is strictly convex iff the strict inequality holds whenever $x \ne y$.
A $C^2$ function $f : \mathbb{R}^n \to \mathbb{R}$ is convex iff $\nabla^2 f(x)$ is positive semi-definite for any $x \in \mathbb{R}^n$.
It is strictly convex if $\nabla^2 f(x)$ is positive definite for any $x \in \mathbb{R}^n$ (the converse is false: $f(x) = x^4$ is strictly convex although $f''(0) = 0$).
Example: a quadratic function
$f(x) = \frac{1}{2} x^T A x + b^T x + c$, $A \in \mathbb{R}^{n \times n}$ (symmetric), $b \in \mathbb{R}^n$, $c \in \mathbb{R}$,
is convex (resp. strictly convex) iff the matrix A is positive semi-definite (resp. positive definite). 31
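
For the quadratic example the criterion is a one-liner, since the Hessian is the constant matrix A; a sketch with arbitrarily chosen test matrices:

```python
import numpy as np

def quadratic_kind(A, tol=1e-10):
    """Convexity of f(x) = 0.5 x^T A x + b^T x + c via the eigenvalues of A."""
    eig = np.linalg.eigvalsh(A)
    if np.all(eig > tol):
        return "strictly convex"
    if np.all(eig >= -tol):
        return "convex"
    return "not convex"

print(quadratic_kind(np.array([[2.0, 0.5], [0.5, 1.0]])))    # strictly convex
print(quadratic_kind(np.array([[1.0, 0.0], [0.0, 0.0]])))    # convex (semi-definite)
print(quadratic_kind(np.array([[1.0, 0.0], [0.0, -1.0]])))   # not convex
```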

Some important properties of convex functions - (2)
If $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^1$ convex function, then
$f(y) \ge f(x) + [\nabla f(x)]^T (y - x)$, $\forall x, y \in \mathbb{R}^n$,
where $\nabla f$ denotes the gradient of f.
Any stationary point of a $C^1$ convex function $f : \mathbb{R}^n \to \mathbb{R}$ is a global minimizer of this function. In other words, $x^*$ is a global minimizer of a $C^1$ convex function $f : \mathbb{R}^n \to \mathbb{R}$ iff $\nabla f(x^*) = 0$.
As a consequence, a $C^1$ strictly convex function $f : \mathbb{R}^n \to \mathbb{R}$ admits at most one stationary point. 32

Subdifferential of a convex function
If $f : \mathbb{R}^n \to \mathbb{R}$ is a $C^1$ convex function, then
$f(y) \ge f(x) + [\nabla f(x)]^T (y - x)$, $\forall x, y \in \mathbb{R}^n$,
where $\nabla f$ denotes the gradient of f.
Generalization
A vector $s \in \mathbb{R}^n$ is a subgradient of a convex function $f : \mathbb{R}^n \to \mathbb{R}$ at a point $x \in \mathbb{R}^n$ iff
$f(y) \ge f(x) + s^T (y - x)$, $\forall y \in \mathbb{R}^n$.
The subdifferential of f at x [denoted $\partial f(x)$] is the set of all its subgradients at this point.
If f is $C^1$ at x then $\partial f(x) = \{\nabla f(x)\}$.
$x^*$ is a global minimizer of f iff $0 \in \partial f(x^*)$. 33
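
The standard one-dimensional illustration (not from the slides) is $f(x) = |x|$, convex but not differentiable at 0, with $\partial f(0) = [-1, 1]$; the sketch below tests candidate subgradients against the defining inequality on a grid:

```python
import numpy as np

# s is a subgradient of f(x) = |x| at x iff |y| >= |x| + s*(y - x) for all y.
def is_subgradient(s, x, ys):
    return all(abs(y) >= abs(x) + s * (y - x) - 1e-12 for y in ys)

ys = np.linspace(-5, 5, 1001)
print(is_subgradient(0.5, 0.0, ys))   # True:  0.5 lies in [-1, 1] = subdifferential at 0
print(is_subgradient(1.5, 0.0, ys))   # False: 1.5 is not a subgradient at 0
print(is_subgradient(0.0, 0.0, ys))   # True:  0 is a subgradient, so 0 is a global minimizer
```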

MPT: partial classification - (1)
Preliminary convention: f is continuous, defined on a continuous set; f is not necessarily differentiable. We don't consider integer (or discrete) programming.

Unconstrained problem: $m = 0$ and $S = \mathbb{R}^n$
- Basis of everything! Many general methods solve a sequence of unconstrained problems.

Quasi-unconstrained problem: $m = 0$ and S is an interval, i.e. the minimization is only subject to side constraints
$\underline{x}_i \le x_i \le \bar{x}_i$ $(i = 1,\dots,n)$
- Special case of linearly constrained problems (see next slide).
- Straightforward adaptation of unconstrained optimization methods.
- Useful for dual problems (see chapter 08). 34

MPT: partial classification - (2)
Linear problem: f and $g_j$ linear, S interval of $\mathbb{R}^n$
- Well documented (standard packages).
- Some general methods solve a sequence of linear problems (SLP).

Linearly constrained problem: f nonlinear, $g_j$ linear, S interval of $\mathbb{R}^n$
- Easy adaptation of unconstrained optimization techniques.
- Some general methods solve a sequence of linearly constrained problems (see structural optimization).
- An interesting particular case: f quadratic,
$f(x) = \frac{1}{2} x^T A x + b^T x + c$
with A (symmetric) positive (semi-)definite. Special case of convex programming. Reference problem, especially in the unconstrained case (conjugacy, convergence properties, etc). Some general methods solve a sequence of quadratic problems (SQP). 35

MPT: partial classification - (3)
Convex problem: f and $g_j$ convex, S convex subset of $\mathbb{R}^n$
- Global solution (see slide 32).
- KKT = sufficient conditions if the Slater condition is satisfied.
- Duality is rigorous (see chapter 08).

Separable problem: $f(x) = \sum_{i=1}^n f_i(x_i)$, $g_j(x) = \sum_{i=1}^n g_{ji}(x_i)$, and S is an interval defined by side constraints
$\underline{x}_i \le x_i \le \bar{x}_i$ $(i = 1,\dots,n)$
- Simplifications! The problem is equivalent to n one-dimensional subproblems. $\nabla^2 f$ and $\nabla^2 g_j$ are diagonal matrices. Possibility to use parallelism. 36

IN PRACTICE
Use the Taylor expansion with appropriate approximations of $\nabla f$ and $\nabla^2 f$.
Assume that
- f is sufficiently continuously differentiable in $V(x)$,
- h is such that $[x, x+h] \subset V(x)$.
Then
$f(x + h) = f(x) + \sum_{i=1}^n h_i \frac{\partial f}{\partial x_i}(x) + \frac{1}{2} \sum_{k=1}^n \sum_{l=1}^n h_k h_l \frac{\partial^2 f}{\partial x_k \partial x_l}(x) + \dots$
$= f(x) + h^T \nabla f(x) + \frac{1}{2} h^T \nabla^2 f(x)\, h + \dots$
where $h^T \nabla f(x)$ may equivalently be written $[\nabla f(x)]^T h$. 37
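
A quick numerical check of the expansion (function and point chosen arbitrarily): for a quadratic f the second-order expansion is exact, so the remainder below vanishes up to rounding:

```python
import numpy as np

f    = lambda x: x[0]**2 + x[1]**2
grad = lambda x: np.array([2 * x[0], 2 * x[1]])
hess = lambda x: np.array([[2.0, 0.0], [0.0, 2.0]])

x = np.array([1.0, -1.0])
h = np.array([1e-3, 2e-3])
taylor = f(x) + h @ grad(x) + 0.5 * h @ hess(x) @ h
print(abs(f(x + h) - taylor))   # ~ 0: a quadratic equals its second-order expansion
```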

A FEW WORDS ABOUT THE TOPOLOGY OF $\mathbb{R}^n$
$\mathbb{R}^n$ is a vector space for the following operations:
- addition: $(x + y)_i = x_i + y_i$,
- multiplication by a scalar: $(\lambda x)_i = \lambda x_i$.
$\mathbb{R}^n$ is classically equipped with the inner product
$(x \mid y) = \sum_{i=1}^n x_i y_i = x^T y = y^T x$,
the associated euclidean norm
$\|x\| = \sqrt{(x \mid x)} = \sqrt{\sum_{i=1}^n x_i^2}$,
and the associated euclidean distance or metric
$d(x,y) = \|x - y\| = \sqrt{\sum_{i=1}^n (x_i - y_i)^2}$.
$\mathbb{R}^n$ can thus be equipped with a topological structure.
Other definitions can be adopted for the inner product and, as a consequence, for the norm, etc. (see chapter 04). 38

Sequences in $\mathbb{R}^n$
$\{x^k\}_{k \in \mathbb{N}}$ converges to $x^*$ [$x^k \to x^*$] iff
$(\forall \varepsilon > 0)(\exists K \in \mathbb{N})(\forall k \ge K) : \|x^k - x^*\| \le \varepsilon$.
In that case, $x^*$ is called the limit of $\{x^k\}_{k \in \mathbb{N}}$.
$x^*$ is an accumulation point of $\{x^k\}$ if there is a sub-sequence of $\{x^k\}$ that converges to $x^*$.
A convergent sequence admits a unique accumulation point (its limit). The converse is not true. 39

Sequences in $\mathbb{R}^n$ - A particular case (n = 1)
The upper limit of $\{x^k\}$ [$\limsup x^k$] is its greatest accumulation point.
The lower limit of $\{x^k\}$ [$\liminf x^k$] is its smallest accumulation point.
$\liminf x^k = -\limsup (-x^k)$
$\liminf x^k = \limsup x^k = x^* \iff x^k \to x^*$ 40
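
As a small illustration (not from the slides), the scalar sequence $x^k = (-1)^k (1 + 1/k)$ does not converge but has the two accumulation points $\pm 1$, so $\limsup x^k = 1$ and $\liminf x^k = -1$:

```python
import numpy as np

k = np.arange(1, 100001)
xk = (-1.0)**k * (1 + 1.0 / k)
# The tails of the odd and even subsequences approach the two accumulation points
print(xk[-2], xk[-1])   # ~ -1.0 and ~ +1.0: liminf = -1, limsup = +1
```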

IN PRACTICE - Efficiency of an algorithm
Global behaviour
An algorithm introduced to solve problem (P) is said to be globally convergent if, for any starting point $x^0$, the sequence generated by this algorithm converges to a point which satisfies a necessary optimality condition.
Asymptotic (or local) behaviour
Hypothesis. $\{x^k\}$ converges to $x^*$ in $\mathbb{R}^n$.
Objective. To measure the speed of convergence of $\{x^k\}$ for k large (asymptotic behaviour) or, in other words, in a neighbourhood of $x^*$ (local behaviour). 41

Order (or rate) of convergence
The order (or rate) of convergence of the sequence $\{x^k\}$ is the greatest positive integer p for which there exist $K \in \mathbb{N}$ and $C > 0$ such that
$\|x^{k+1} - x^*\| \le C \|x^k - x^*\|^p$, $\forall k \ge K$. (1)
The speed of convergence of $\{x^k\}$ increases with p. If this rate is 2, the convergence is said to be quadratic.
(1) is satisfied if
$\lim_{k \to \infty} \frac{\|x^{k+1} - x^*\|}{\|x^k - x^*\|^p} < \infty$. 42

p-linear convergence
The sequence $\{x^k\}$ is p-linearly convergent if there exist $K \in \mathbb{N}$ and $0 < C < 1$ such that
$\|x^{k+1} - x^*\| \le C \|x^k - x^*\|^p$, $\forall k \ge K$.
For p = 1: linear convergence.
$\{x^k\}$ is p-linearly convergent if
$\lim_{k \to \infty} \frac{\|x^{k+1} - x^*\|}{\|x^k - x^*\|^p} = \rho < 1$. (2)
The speed of convergence of the sequence increases when $\rho$ decreases. The smallest $\rho$ for which (2) holds is the ratio of convergence of the sequence. 43

p-superlinear convergence
The sequence $\{x^k\}$ is p-superlinearly convergent if there exist $K \in \mathbb{N}$ and $C_k \to 0$ such that
$\|x^{k+1} - x^*\| \le C_k \|x^k - x^*\|^p$, $\forall k \ge K$.
$\{x^k\}$ is p-superlinearly convergent if
$\lim_{k \to \infty} \frac{\|x^{k+1} - x^*\|}{\|x^k - x^*\|^p} = 0$.
For p = 1: superlinear convergence. 44
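
In practice p can be estimated from the error sequence: if $e_{k+1} \approx C e_k^p$ then $p \approx \ln(e_{k+1}/e_k) / \ln(e_k/e_{k-1})$. A sketch (using Newton's method, which is quadratically convergent, as the test case; not course material):

```python
import numpy as np

# Newton's method on f(x) = x^2 - 2; the root is sqrt(2).
x, xs = 1.0, []
for _ in range(5):
    x -= (x**2 - 2) / (2 * x)          # Newton step
    xs.append(x)

e = np.abs(np.array(xs) - np.sqrt(2))  # errors e_k = |x^k - x*|
for k in range(1, len(e) - 1):
    if e[k - 1] > 0 and e[k] > 0 and e[k + 1] > 0:
        p = np.log(e[k + 1] / e[k]) / np.log(e[k] / e[k - 1])
        print(p)                        # ~ 2: quadratic convergence
```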

IN PRACTICE - Efficiency of an algorithm: another approach
Another measure of the efficiency of an algorithm introduced to solve problem (P) can be obtained by considering not the sequence $\{x^k\}$ itself but the corresponding sequence $\{f(x^k)\}$. This approach leads one to replace, in the previous definitions, expressions of the form $\|x^k - x^*\|$ by $|f(x^k) - f(x^*)|$.
The two approaches are equivalent when f is $C^2$ at $x^*$ and $\nabla^2 f(x^*)$ is positive definite. In other cases, they can differ. 45

COMPLEMENTS ABOUT THE TOPOLOGY OF $\mathbb{R}^n$
Balls and neighbourhood of a point
Let $a \in \mathbb{R}^n$ and $R > 0$ be given.
The open ball with center a and radius R is the set
$B(a,R) = \{x \in \mathbb{R}^n : \|x - a\| < R\}$.
The corresponding closed ball is
$\bar{B}(a,R) = \{x \in \mathbb{R}^n : \|x - a\| \le R\}$.
The corresponding sphere is
$S(a,R) = \{x \in \mathbb{R}^n : \|x - a\| = R\}$.
A neighbourhood of a point x in $\mathbb{R}^n$ is a set that contains at least one (open) ball centered on x. 46

COMPLEMENTS ABOUT THE TOPOLOGY OF $\mathbb{R}^n$
Interior, boundary and closure of a set
Let $S \subseteq \mathbb{R}^n$ be given.
A point $x \in S$ is an interior point of S iff it admits at least one neighbourhood entirely included in S. The interior of S [denoted int(S)] is the set of all its interior points.
The exterior of S is the interior of its complement in $\mathbb{R}^n$.
The boundary of S [denoted $\delta(S)$] is the set of the points that are neither in its interior nor in its exterior.
The closure of S [denoted cl(S)] is the set resulting from the union of its interior with its boundary. 47

COMPLEMENTS ABOUT THE TOPOLOGY OF $\mathbb{R}^n$
Open and closed sets
Let $S \subseteq \mathbb{R}^n$ be given.
S is open iff it coincides with its interior. S is closed iff it coincides with its closure.
S is open iff its complement in $\mathbb{R}^n$ is closed.
The intersection of a finite number of open sets is open; any union of open sets is open.
A closed set contains the limits of its convergent sequences:
$\{x^k : k \in \mathbb{N}\} \subseteq S$, S closed, $x^k \to x^*$ $\Rightarrow$ $x^* \in S$. 48

COMPLEMENTS ABOUT THE TOPOLOGY OF $\mathbb{R}^n$
Bounded and compact sets
Let $S \subseteq \mathbb{R}^n$ be given.
S is bounded iff it is included in a ball.
S is compact iff, from any sequence in S, one can extract a subsequence which converges to a point of S.
A set $S \subseteq \mathbb{R}^n$ is compact iff it is both closed and bounded. 49

Important results with regard to optimization
Theorem (Weierstrass)
Assume that f is a continuous (real valued) function defined on a compact set $K \subseteq \mathbb{R}^n$. The problem
(P_K) Minimize $f(x)$ subject to $x \in K$
admits a global optimum.
Corollary
Assume that f is a continuous (real valued) function defined on $\mathbb{R}^n$, such that $f(x) \to +\infty$ when $\|x\| \to +\infty$ (one says that f is coercive). Then f admits a global minimizer on $\mathbb{R}^n$. 50