MULTIPLE GRADIENT DESCENT ALGORITHM (MGDA) FOR MULTIOBJECTIVE OPTIMIZATION
1 Multiple Gradient Descent Algorithm (MGDA) for multiobjective optimization — description and numerical illustrations. Jean-Antoine Désidéri, INRIA Project-Team OPALE, Sophia Antipolis Méditerranée Centre. ISMP, International Symposium on Mathematical Programming, Berlin, August 19-24, 2012, session "PDE-Constrained Optimization and Multi-Level/Multi-Grid Methods"; ECCOMAS 2012, European Congress on Computational Methods in Applied Sciences and Engineering, Vienna (Austria), September 10-14, 2012, TO16: Computational inverse problems and optimization. 1 / 102
2 Outline
3 Pareto optimality. Design vector Y ∈ Υ ⊆ R^N; objective functions to be minimized (reduced): J_i(Y) (i = 1,…,n) (see e.g. K. Miettinen, Nonlinear Multiobjective Optimization, Kluwer Academic Publishers). Dominance in efficiency: design-point Y¹ dominates design-point Y² in efficiency, Y¹ ⪯ Y², iff J_i(Y¹) ≤ J_i(Y²) (∀i = 1,…,n) and at least one inequality is strict (a relationship of partial order). The designer's Holy Grail: the Pareto set, the set of non-dominated design-points, i.e. the Pareto-optimal solutions. The Pareto front: the image of the Pareto set in objective-function space; it bounds the domain of attainable performance.
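The dominance relation above translates directly into code; a minimal sketch in Python (function names are illustrative, not from the talk):

```python
def dominates(J1, J2):
    """True iff the design-point with objectives J1 dominates the one with J2:
    J1[i] <= J2[i] for all i, with at least one strict inequality."""
    return all(a <= b for a, b in zip(J1, J2)) and any(a < b for a, b in zip(J1, J2))

def pareto_set(points):
    """Non-dominated subset of a finite list of objective vectors."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

For example, `pareto_set([(1, 2), (2, 1), (2, 2)])` keeps only the first two points, since (2, 2) is dominated by (1, 2).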
4 Pareto front identification. Favorable situation: continuous and convex Pareto front; gradient-based optimization can then be used efficiently, if the number n of objective functions is not too large. Non-convex or discontinuous Pareto fronts do exist; evolutionary strategies or GAs are most commonly used for robustness: NSGA-II (Deb), PAES. From: A Fast and Elitist Multiobjective Genetic Algorithm: NSGA-II, K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, IEEE Transactions on Evolutionary Computation, Vol. 6, No. 2, April 2002. Nondominated solutions with NSGA-II on KUR (test case by F. Kursawe, 1990), n = 3:
f₁(x) = Σ_{i=1}^{n−1} [ −10 exp( −0.2 √(x_i² + x_{i+1}²) ) ]
f₂(x) = Σ_{i=1}^{n} [ |x_i|^0.8 + 5 sin(x_i³) ]
5 Our general objective: construct a robust gradient-based method for Pareto-front identification. Note: in shape optimization, N is often the dimension of a parameterization, meant to be large as the dimension of a discretization usually is; thus, often, N ≫ n. However, the following theoretical results also apply to cases where n > N, which can occur in other engineering-optimization situations.
6 Outline
7 The basic question: constructing a gradient-based cooperative algorithm. Knowing the gradients ∇J_i(Y⁰) of n criteria, assumed to be smooth functions of the design vector Y ∈ R^N, at a given starting design-point Y⁰, can we define a nonzero vector ω ∈ R^N, in the direction of which the Fréchet derivatives of all criteria have the same sign: (∇J_i(Y⁰), ω) ≥ 0 (∀i = 1,2,…,n)? Answer: yes, if Y⁰ is not Pareto-optimal! Then ω locally defines a descent direction common to all criteria.
8 Notion of Pareto stationarity. Classically, for smooth real-valued functions of N variables: optimality (+ regularity) ⇒ stationarity. Now, for smooth R^n-valued functions of N variables, which stationarity requirement is to be enforced for Pareto-optimality? Proposition 1 (Pareto stationarity): the objective functions are Pareto-stationary at a design-point Y⁰, in an open domain B in which they are smooth and convex, if there exists a convex combination of the gradients equal to 0: α = {α_i} (i=1,…,n) s.t. Σ_{i=1}^n α_i ∇J_i(Y⁰) = 0, α_i ≥ 0 (∀i), Σ_{i=1}^n α_i = 1. Then (to be established subsequently): Pareto-optimality (+ regularity) ⇒ Pareto stationarity.
9 General principle. Proposition 2 (minimum-norm element in the convex hull): let {u_i} (i=1,…,n) be a family of n vectors in R^N, and U the so-called convex hull of this family, i.e. the following set of vectors: U = { u ∈ R^N : u = Σ_{i=1}^n α_i u_i ; α_i ≥ 0 (∀i) ; Σ_{i=1}^n α_i = 1 }. Then U admits a unique element of minimal norm, ω, and the following holds: ∀u ∈ U: (u, ω) ≥ ‖ω‖², in which (u, v) denotes the usual scalar product of the vectors u and v.
10 Existence and uniqueness of ω. Existence: U is closed. Uniqueness: U is convex. To establish uniqueness, let ω₁ and ω₂ be two realisations of the minimum: ω₁, ω₂ ∈ Argmin_{u∈U} ‖u‖. Then, since U is convex, ∀ε ∈ [0,1]: ω₁ + ε r₁₂ ∈ U (r₁₂ = ω₂ − ω₁) and ‖ω₁ + ε r₁₂‖ ≥ ‖ω₁‖. Square both sides and replace by scalar products: ∀ε ∈ [0,1]: (ω₁ + ε r₁₂, ω₁ + ε r₁₂) − (ω₁, ω₁) ≥ 0, i.e. 2ε(r₁₂, ω₁) + ε²(r₁₂, r₁₂) ≥ 0. As ε → 0⁺, this condition requires that (r₁₂, ω₁) ≥ 0; but then, for ε = 1: ‖ω₂‖² − ‖ω₁‖² = 2(r₁₂, ω₁) + (r₁₂, r₁₂) > 0 unless r₁₂ = 0, i.e. ω₂ = ω₁. Remark: the identification of the element ω is equivalent to the constrained minimization of a quadratic form in R^n, and not R^N.
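The minimum-norm element of Proposition 2 can be approximated numerically in several ways; the sketch below uses a Frank-Wolfe iteration on the simplex of weights α (an illustrative choice of method, not the parameterization used later in the talk):

```python
def min_norm_in_hull(vectors, iters=20000):
    """Approximate omega = argmin_{u in U} ||u|| over the convex hull U of
    `vectors`, via Frank-Wolfe on f(alpha) = ||sum_i alpha_i u_i||^2."""
    n, d = len(vectors), len(vectors[0])
    alpha = [1.0 / n] * n
    for t in range(iters):
        w = [sum(alpha[i] * vectors[i][k] for i in range(n)) for k in range(d)]
        # gradient of f w.r.t. alpha_i is 2*(u_i, w); the linear oracle picks
        # the hull vertex (a single u_i) with the smallest scalar product
        j = min(range(n), key=lambda i: sum(vectors[i][k] * w[k] for k in range(d)))
        gamma = 2.0 / (t + 2)
        alpha = [(1 - gamma) * a for a in alpha]
        alpha[j] += gamma
    return [sum(alpha[i] * vectors[i][k] for i in range(n)) for k in range(d)]
```

For u₁ = (1, 0) and u₂ = (0, 1) this converges to ω = (½, ½), and one can check the stated property (u_i, ω) ≥ ‖ω‖² on the result.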
11 First consequence: ∀u ∈ U: (u, ω) ≥ ‖ω‖². Proof: let u ∈ U, and r = u − ω. ∀ε ∈ [0,1]: ω + εr ∈ U (convexity); hence ‖ω + εr‖² − ‖ω‖² = 2ε(r, ω) + ε²(r, r) ≥ 0, and as ε → 0⁺ this requires that (r, ω) = (u − ω, ω) ≥ 0.
12 Proof of Pareto stationarity at a Pareto-optimal design-point. Y⁰ ∈ B (open ball) ⊆ R^N. Hypothesis: {J_i(Y)} (∀i ≤ n) smooth and convex in B; u_i = ∇J_i(Y⁰) (∀i ≤ n). Without loss of generality, assume J_i(Y⁰) = 0 (∀i). Then: Y⁰ = Argmin_Y J_n(Y) subject to J_i(Y) ≤ 0 (∀i ≤ n−1). (1) Let U_{n−1} be the convex hull of the gradients {u₁, u₂, …, u_{n−1}} and ω_{n−1} = Argmin_{u ∈ U_{n−1}} ‖u‖. The vector ω_{n−1} exists, is unique, and such that (u_i, ω_{n−1}) ≥ ‖ω_{n−1}‖² (∀i ≤ n−1). Two possible situations: 1. ω_{n−1} = 0, and the objective functions {J₁, J₂, …, J_{n−1}} satisfy the Pareto-stationarity condition at Y = Y⁰; a fortiori, the condition is also satisfied by the whole set of objective functions. 2. Otherwise ω_{n−1} ≠ 0. Let j_i(ε) = J_i(Y⁰ − ε ω_{n−1}) (i = 1,…,n−1), so that j_i(0) = 0 and j_i′(0) = −(u_i, ω_{n−1}) ≤ −‖ω_{n−1}‖² < 0; then for sufficiently small strictly-positive ε: j_i(ε) = J_i(Y⁰ − ε ω_{n−1}) < 0 (∀i ≤ n−1), and Slater's qualification condition is satisfied for (1). Thus the Lagrangian L = J_n(Y) + Σ_{i=1}^{n−1} λ_i J_i(Y) is stationary w.r.t. Y at Y = Y⁰: u_n + Σ_{i=1}^{n−1} λ_i u_i = 0, in which λ_i ≥ 0 (∀i ≤ n−1) since the constraints hold with equality, J_i(Y⁰) = 0 (KKT conditions). Normalizing this equation results in the Pareto-stationarity condition.
13 R^N: affine and vector structures — notation. The two are usually not distinguished. When a distinction is necessary, a dotted symbol is used for an affine subset/subspace; in particular, for the convex hull, U̇ denotes the set of points that the vectors u ∈ U, drawn from a given origin O, point to. Rigorously speaking, the term "convex hull" applies to U̇.
14 R^N, affine and vector structures: a case where O ∉ U̇. [Figure: convex hulls — the affine hull U̇ in Ȧ₂ versus the vector hull U, with vectors u₁, u₂, u₃ drawn from the origin O.]
15 Second consequence. Note that ∀u ∈ U: u − u_n = Σ_{i=1}^n α_i u_i − (Σ_{i=1}^n α_i) u_n = Σ_{i=1}^{n−1} α_i u_{n,i} (u_{n,i} = u_i − u_n). Thus U ⊆ A_{n−1} (or, in affine space, U̇ ⊆ Ȧ_{n−1}), where A_{n−1} is a set of vectors pointing to an affine subspace Ȧ_{n−1} of dimension at most n−1. Define the orthogonal projection O′ of O onto Ȧ_{n−1}; then OO′ ⊥ Ȧ_{n−1} ⊇ U̇. If O′ ∈ U̇, then ω = OO′, and in this case: ∀u ∈ U: (u, ω) = const. = ‖ω‖².
16 Second consequence — proof. Since O′ is the orthogonal projection of O onto Ȧ_{n−1}, OO′ ⊥ Ȧ_{n−1}, and in particular OO′ ⊥ U. If additionally O′ ∈ U̇, the vector OO′ is the minimum-norm element of U: ω = OO′. In this case, ω can be calculated by ignoring the inequality constraints α_i ≥ 0 (∀i), which are automatically satisfied by the solution. Hence ω realizes the minimum of the quadratic form ‖Σ_{i=1}^n α_i u_i‖² subject only to the equality constraint Σ_{i=1}^n α_i = 1. The Lagrangian is formed with a unique multiplier λ: L(α, λ) = ½ (Σ_{i=1}^n α_i u_i, Σ_{i=1}^n α_i u_i) + λ (Σ_{i=1}^n α_i − 1). The stationarity condition requires, ∀i: ∂L/∂α_i = (u_i, ω) + λ = 0 ⇒ (u_i, ω) = −λ = const. = ‖ω‖². By convex combination, the result extends straightforwardly to the whole of U.
17 Application to the case of gradients. Suppose u_i = ∇J_i(Y⁰) (∀i); then: either ω ≠ 0, and all criteria admit positive Fréchet derivatives in the direction of ω (all equal if ω belongs to the interior of U); or ω = 0, and the current design-point Y⁰ is Pareto-stationary: ∃{α_i} (i=1,…,n), α_i ≥ 0 (∀i), Σ_{i=1}^n α_i = 1, such that Σ_{i=1}^n α_i u_i = 0.
18 Proposition 3 (common descent direction). By virtue of Propositions 1 and 2, in the particular case where u_i = J_i′ = ∇J_i(Y⁰)/S_i (S_i: user-supplied scaling constant; S_i > 0), two situations are possible at Y = Y⁰: either ω = 0, and the design-point Y⁰ is Pareto-stationary (or Pareto-optimal); or ω ≠ 0, and ω defines a descent direction common to all criteria {J_i} (i=1,…,n); additionally, if ω lies in the interior of U, the scalar products (u, ω) (u ∈ U) and the Fréchet derivatives (u_i, ω) are all equal to ‖ω‖². MGDA: substitute the vector ω for the single-criterion gradient in the steepest-descent method. Proposition 4 (convergence): certain normalization provisions being made, in R^N, the Multiple Gradient Descent Algorithm (MGDA) converges to a Pareto-stationary point.
19 Convergence proof. Define the criteria to be positive (possibly apply an exp-transform) and coercive (possibly add an exploding term outside of the working ball): assume these criteria continuous, with J_i(Y) → ∞ as ‖Y‖ → ∞ (∀i). If the iteration stops in a finite number of steps, a Pareto-stationary point has been reached. Otherwise, the iteration continues indefinitely, generating an infinite sequence of design-points {Y_k}. The corresponding sequences of criteria are infinite, positive and monotone decreasing; hence bounded. Hence the sequence of iterates {Y_k} is itself bounded and admits a subsequence converging to, say, Y̅. Necessarily, Y̅ is a Pareto-stationary point. To establish this, assume instead that the ω corresponding to Y̅ is nonzero. Then for each criterion there exists a stepsize ρ for which the variation is finite. These criteria are in finite number; hence the smallest such ρ will cause a finite variation of all criteria, in contradiction with the fact that only infinitely-small variations of the criteria are realized from Y̅.
20 Remark 1: practical determination of vector ω. ω = Argmin_{u∈U} ‖u‖, U = { u ∈ R^N : u = Σ_{i=1}^n α_i u_i ; α_i ≥ 0 (∀i) ; Σ_{i=1}^n α_i = 1 }, to be solved in U ⊆ R^N. Usually, but not necessarily, N ≫ n. Parameterization of the convex hull: let α_i = σ_i² (i = 1,…,n) to satisfy trivially the inequality constraints (α_i ≥ 0, ∀i), and transform the equality constraint Σ_{i=1}^n α_i = 1 into Σ_{i=1}^n σ_i² = 1, i.e. σ = (σ₁, σ₂, …, σ_n) ∈ S_n, where S_n is the unit sphere of R^n — and precisely not of R^N.
21 Determining vector ω (cont'd). The sphere is easily parameterized using trigonometric functions of n−1 independent arcs φ₁, φ₂, …, φ_{n−1}:
σ₁ = cos φ₁ · cos φ₂ · cos φ₃ ⋯ cos φ_{n−1}
σ₂ = sin φ₁ · cos φ₂ · cos φ₃ ⋯ cos φ_{n−1}
σ₃ = 1 · sin φ₂ · cos φ₃ ⋯ cos φ_{n−1}
…
σ_{n−1} = sin φ_{n−2} · cos φ_{n−1}
σ_n = sin φ_{n−1}
(consider only φ_i ∈ [0, π/2], ∀i, and set φ₀ = π/2). Let c_i = cos² φ_i and get (i = 1,…,n):
α₁ = c₁ · c₂ · c₃ ⋯ c_{n−1}
α₂ = (1−c₁) · c₂ · c₃ ⋯ c_{n−1}
α₃ = 1 · (1−c₂) · c₃ ⋯ c_{n−1}
…
α_{n−1} = (1−c_{n−2}) · c_{n−1}
α_n = (1−c_{n−1})
(c₀ = 0, and c_i ∈ [0,1] for all i ≥ 1)
22 Determining vector ω (end). The convex hull is thus parameterized in [0,1]^{n−1} (n: number of criteria), independently of N (dimension of design space). For example, with n = 4 criteria: α₁ = c₁c₂c₃, α₂ = (1−c₁)c₂c₃, α₃ = (1−c₂)c₃, α₄ = (1−c₃), with (c₁, c₂, c₃) ∈ [0,1]³.
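The c-parameterization above, together with a brute-force grid search over [0,1]^{n−1} (the procedure the talk later applies with four criteria), can be sketched as follows; names and the grid resolution are illustrative:

```python
import itertools

def alphas_from_c(c):
    """Map (c_1, ..., c_{n-1}) in [0,1]^{n-1} to simplex weights:
    alpha_i = (1 - c_{i-1}) * c_i * c_{i+1} * ... * c_{n-1}, with c_0 := 0."""
    n = len(c) + 1
    alphas = []
    for i in range(1, n + 1):
        a = 1.0 if i == 1 else (1.0 - c[i - 2])
        for k in range(i - 1, n - 1):   # factors c_i ... c_{n-1}
            a *= c[k]
        alphas.append(a)
    return alphas

def omega_by_grid(vectors, steps=20):
    """Brute-force search for the minimum-norm hull element on a uniform
    grid over (c_1, ..., c_{n-1}); the best set of coefficients is retained."""
    n, d = len(vectors), len(vectors[0])
    grid = [j / steps for j in range(steps + 1)]
    best, best_norm2 = None, float("inf")
    for c in itertools.product(grid, repeat=n - 1):
        al = alphas_from_c(c)
        w = [sum(al[i] * vectors[i][k] for i in range(n)) for k in range(d)]
        norm2 = sum(x * x for x in w)
        if norm2 < best_norm2:
            best, best_norm2 = w, norm2
    return best
```

Note the cost grows as (steps+1)^{n−1}, i.e. with the number of criteria n, but is independent of the design-space dimension N.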
23 Remark 2: appropriate normalization of the gradients may reveal essential: u_i = ∇J_i(Y⁰)/S_i. Case n = 2, without gradient normalization: ω = OO′, unless the angle between u₁ and u₂ is acute and the norms ‖u₁‖, ‖u₂‖ are very different; in that case, ω equals the vector of smaller norm. [Figure: configurations with ω = OO′ versus ω = u₂.] With normalization: the equality ω = OO′ holds automatically ⇒ equal Fréchet derivatives. General case: if the vectors {u_i} are NOT normalized, those of smaller norm are more influential in determining the direction of the vector ω.
24 Normalizations being examined. Recall: u_i = J_i′ = ∇J_i(Y⁰)/S_i (S_i > 0) ⇒ for δY = −ρω: δJ_i = (∇J_i, δY) = −ρ S_i (u_i, ω), with (u_i, ω) = const. if ω does not lie on the edge of the convex hull. Standard: u_i = ∇J_i(Y⁰)/‖∇J_i(Y⁰)‖. Equal logarithmic variations (whenever ω is not on an edge): u_i = ∇J_i(Y⁰)/J_i(Y⁰). Newton-inspired, when lim J_i = 0: u_i = J_i(Y⁰) ∇J_i(Y⁰)/‖∇J_i(Y⁰)‖². Newton-inspired, when lim J_i ≠ 0: u_i = max(J_i^{(k−1)} − J_i^{(k)}, δ) ∇J_i(Y⁰)/‖∇J_i(Y⁰)‖² (k: iteration no., δ > 0 small).
25 Is standard normalization sufficient to guarantee that ω = OO′ (yielding equal Fréchet derivatives)? No, if n > 2. [Figure: unit sphere with u₁, u₂, u₃; the projection O′ of O onto Ȧ₂ lies outside U̇, OO′ ∉ U.]
26 Experimenting (from Adrien Zerbinati, on-going doctoral thesis). Basic testcase: minimize the functions f(x,y) = 4x² + y² + xy and g(x,y) = (x−1)² + 3(y−1)².
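On this two-criterion testcase the whole MGDA loop fits in a few lines, since for n = 2 the minimum-norm element of the segment [u₁, u₂] has a closed form. A minimal sketch (the fixed stepsize is an assumption for illustration, not the talk's stepsize rule):

```python
def mgda_2criteria(grad_f, grad_g, y0, rho=0.02, iters=2000):
    """MGDA for two criteria: omega is the minimum-norm element of the
    segment [u1, u2]; the design-point steps along -rho * omega."""
    y = list(y0)
    for _ in range(iters):
        u1, u2 = grad_f(y), grad_g(y)
        d = [a - b for a, b in zip(u1, u2)]
        dd = sum(x * x for x in d)
        # argmin over t in [0,1] of ||(1-t) u1 + t u2||
        t = 0.0 if dd == 0 else max(0.0, min(1.0, sum(di * a for di, a in zip(d, u1)) / dd))
        omega = [(1 - t) * a + t * b for a, b in zip(u1, u2)]
        y = [yi - rho * wi for yi, wi in zip(y, omega)]
    return y

# the talk's basic testcase
f = lambda x, y: 4 * x**2 + y**2 + x * y
g = lambda x, y: (x - 1)**2 + 3 * (y - 1)**2
grad_f = lambda p: [8 * p[0] + p[1], 2 * p[1] + p[0]]
grad_g = lambda p: [2 * (p[0] - 1), 6 * (p[1] - 1)]
```

Starting e.g. from (0.5, 2), one of the initial points on the next slide, both criteria decrease monotonically and the iterates approach a point where ω → 0, i.e. a Pareto-stationary point.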
27 Basic testcase: convergence from different initial design-points — (0.5, 2), (1.5, 2.5), (−1.5, −2.5). [Figures: trajectories in design space and in function space.]
28 Fonseca testcase. Minimize two functions of three variables: f_{1,2}(x₁, x₂, x₃) = 1 − exp( −Σ_{i=1}^{3} (x_i ± 1/√3)² ).
29 Fonseca testcase (cont'd): convergence from initial design-points over a sphere. [Figures: design space and function space — continuous but nonconvex front.]
30 Fonseca testcase (end): comparison with the Pareto Archived Evolution Strategy (PAES). [Figures: design space and function space; MGDA iterates versus PAES generations (PAES 2×50 and 6×12).]
31 Outline
32 Wing-shape optimization: wave-drag minimization in conjunction with lift maximization; metamodel-assisted MGDA (Adrien Zerbinati). Wing-shape geometry extruded from an airfoil geometry (initial airfoil: NACA0012), 10 parameters. Two-point/two-objective optimization: 1. subsonic Eulerian flow (M = 0.3, α = 8°) for C_L maximization; 2. transonic Eulerian flow (M = 0.83, α = 2°) for C_D minimization. An initial database of 40 design-points evolves as follows: 1. evaluate each new point by two 3D Eulerian flow computations; 2. construct surrogate models for C_L and C_D (using the entire dataset) and apply MGDA to convergence; 3. enrich the database with new points, if appropriate. After 6 loops, the database comprises 227 design-points (554 flow computations).
33 Wing-shape optimization 2: convergence of the dataset over 6 loops. [Figure: −lift coefficient (subsonic) versus drag coefficient (transonic); initial database and steps 1-6.]
34 Wing-shape optimization 3: low-drag quasi-Pareto-optimal shape. [Figures: subsonic, M = 0.3, α = 8°; transonic, M = 0.83, α = 2°.]
35 Wing-shape optimization 4: high-lift quasi-Pareto-optimal shape. [Figures: subsonic, M = 0.3, α = 8°; transonic, M = 0.83, α = 2°.]
36 Wing-shape optimization 5: intermediate quasi-Pareto-optimal shape. [Figures: subsonic, M = 0.3, α = 8°; transonic, M = 0.83, α = 2°.]
37 Outline
38 Description of the test case. In-house CFD solver (Num3sis): compressible Navier-Stokes (FV/FE), parallel computing (MPI); CPU cost: 4 h on 32 cores; CAD and mesh by GMSH. Test case: Mach number 0.1; Reynolds number Re ≈ 2800; two-criterion shape optimization; fine mesh (some 700,000 points).
39 First objective function: velocity variance. U_mean = (1/V_tot) Σ_{cel : d(cel, P_o) ≤ ε} u(cel) Vol(cel); σ²_{vel,x} = (1/V_tot) Σ_{cel : d(cel, P_o) ≤ ε} (U_{mean,x} − u_x(cel))² Vol(cel); σ² = σ²_{vel,x} + σ²_{vel,y} + σ²_{vel,z}.
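The volume-weighted variance above can be sketched in a few lines; the cell data layout (a list of (velocity, volume, distance-to-probe) tuples) is a hypothetical stand-in for the solver's mesh structures:

```python
def velocity_variance(cells, eps):
    """Volume-weighted variance of the velocity over the cells lying within
    distance eps of the probe point P_o. Each cell is a hypothetical tuple
    (u, vol, dist) with u a 3-component velocity vector."""
    sel = [(u, vol) for (u, vol, dist) in cells if dist <= eps]
    vtot = sum(vol for _, vol in sel)
    umean = [sum(u[k] * vol for u, vol in sel) / vtot for k in range(3)]
    # sigma^2 = sigma_x^2 + sigma_y^2 + sigma_z^2, each volume-weighted
    return sum(
        sum((umean[k] - u[k]) ** 2 * vol for u, vol in sel) / vtot
        for k in range(3)
    )
```

For instance, two equal-volume cells with x-velocities 1 and 3 near the probe give U_mean,x = 2 and σ² = 1.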
40 Second objective function: pressure loss. With p_k = p_mean and u_k = u_mean for the inlet (k = i) or outlet (k = o): Δp = p_i − p_o + ρ_i u_i²/2 − ρ_o u_o²/2.
41 Design parameters. [Figure.]
42 MGDA on the surrogate model. [Figure.]
44 Identifying the Pareto set. [Figure: pressure loss versus velocity variance; nominal point, initial database, 3rd/6th/9th steps, initial and final Pareto fronts.] Figure caption: metamodel-assisted MGDA, step-by-step evolution; initial and final non-dominated sets; 223 solver calls.
45 Velocity-magnitude streamlines — shape comparison. 1: best velocity variance; 2: nominal shape; 3: best pressure loss. [Figures.]
46 Velocity magnitude in a section just after the bend. 1: best velocity variance; 2: nominal shape; 3: best pressure loss. [Figures; section position shown.]
47 Velocity magnitude in the output section. 1: best velocity variance; 2: nominal shape; 3: best pressure loss. [Figures; section position shown.]
48 Nominal versus lowest-velocity-variance shape. [Figure.]
49 Nominal versus lowest-pressure-loss shape. [Figure.]
50 Lowest-velocity-variance versus lowest-pressure-loss shape. [Figure.]
51 Outline
52 Dirichlet problem: −Δu = f over Ω = [−1,1] × [−1,1], u = 0 on Γ = ∂Ω. Discrete solution by a standard 2nd-order finite-difference method over a quadrangular mesh. [Figures: u_h surface and contour lines.]
53 Domain partitioning: function-value controls at 4 interfaces, yielding 4 sub-domain Dirichlet problems with 2 controlled interfaces each. [Figure.]
54 Interface jumps — formal expression. Over γ₁ (0 ≤ x ≤ 1; y = 0): s₁(x) = ∂u/∂y(x, 0⁺) − ∂u/∂y(x, 0⁻) = [∂u₁/∂y − ∂u₄/∂y](x, 0); over γ₂ (x = 0; 0 ≤ y ≤ 1): s₂(y) = ∂u/∂x(0⁺, y) − ∂u/∂x(0⁻, y) = [∂u₁/∂x − ∂u₂/∂x](0, y); over γ₃ (−1 ≤ x ≤ 0; y = 0): s₃(x) = ∂u/∂y(x, 0⁺) − ∂u/∂y(x, 0⁻) = [∂u₂/∂y − ∂u₃/∂y](x, 0); over γ₄ (x = 0; −1 ≤ y ≤ 0): s₄(y) = ∂u/∂x(0⁺, y) − ∂u/∂x(0⁻, y) = [∂u₄/∂x − ∂u₃/∂x](0, y). Approximation by 2nd-order one-sided finite differences.
55 Interface functionals and matching condition. Interface functionals: J_i = ½ ∫_{γ_i} s_i² w dγ_i := J_i(v); that is, explicitly: J₁ = ½ ∫₀¹ s₁(x)² w(x) dx, J₂ = ½ ∫₀¹ s₂(y)² w(y) dy, J₃ = ½ ∫₋₁⁰ s₃(x)² w(x) dx, J₄ = ½ ∫₋₁⁰ s₄(y)² w(y) dy (w(t), t ∈ [0,1], is an optional weighting function, and w(−t) = w(t)). Matching condition: J₁ = J₂ = J₃ = J₄ = 0.
56 Adjoint problems: eight Dirichlet sub-problems. Δp_i = 0 (Ω_i), p_i = 0 (∂Ω_i \ γ_i), p_i = s_i w (γ_i); Δq_i = 0 (Ω_i), q_i = 0 (∂Ω_i \ γ_{i+1}), q_i = s_{i+1} w (γ_{i+1}). Green's formula: ∫_{γ_i} s_i (∂u_i/∂n) w = ∫_{γ_i} (∂p_i/∂n) v_i + ∫_{γ_{i+1}} (∂p_i/∂n) v_{i+1}, and ∫_{γ_{i+1}} s_{i+1} (∂u_i/∂n) w = ∫_{γ_i} (∂q_i/∂n) v_i + ∫_{γ_{i+1}} (∂q_i/∂n) v_{i+1}.
57 Functional gradients. Differentiating, J₁′ = ∫₀¹ s₁(x) s₁′(x) w(x) dx, and similarly for J₂′, J₃′, J₄′; via the adjoint states, each gradient is expressed through the adjoint normal derivatives at the interfaces, e.g. ∂(p₁ − q₄)/∂y (x, 0), ∂p₁/∂x (0, y), ∂q₄/∂x (0, y), etc. (the detailed four-term expressions are garbled in this transcription). Conclusion: J_i′ = Σ_{j=1}^{4} ∫_{γ_j} G_{i,j} v_j′ dγ_j (i = 1,…,4); {G_{i,j}}: partial gradients.
58 Other technical details. Dirichlet sub-problems: all 12 sub-problems (4 direct, 4×2 adjoint) are solved by direct inversion (discrete sine-transform): u_h = (Ω_X ⊗ Ω_Y)(Λ_X ⊕ Λ_Y)⁻¹(Ω_X ⊗ Ω_Y) f_h. Integrals are approximated by the trapezoidal rule.
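The trapezoidal-rule evaluation of the interface functionals J_i = ½ ∫ s_i² w dγ of slide 55 can be sketched as follows, assuming nodal values on a uniform grid of spacing h:

```python
def interface_functional(s, w, h):
    """Trapezoidal-rule approximation of J = (1/2) * integral of s(x)^2 w(x) dx
    over a uniform grid of spacing h; s and w are nodal value lists."""
    vals = [0.5 * si * si * wi for si, wi in zip(s, w)]
    return h * (sum(vals) - 0.5 * (vals[0] + vals[-1]))
```

With s(x) = x, w ≡ 1 on [0,1] and 11 nodes, this returns ≈ 1/6 = ½ ∫₀¹ x² dx, with the O(h²) error expected of the rule.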
59 Reference method, applied to the agglomerated criterion J = Σ_{i=1}^4 J_i, with gradient J′ = Σ_{i=1}^4 J_i′. Iteration: v^(l+1) = v^(l) − ρ_l J′^(l). The stepsize is set so that the predicted decrease δJ = J′^(l) · δv^(l) = ρ_l ‖J′^(l)‖² equals ε J^(l), by fixing ρ_l = ε J^(l) / ‖J′^(l)‖².
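This adaptive stepsize rule can be sketched on a toy criterion (an illustrative quadratic, not the Dirichlet functional of the talk):

```python
def descend(J, gradJ, v0, eps=1.0, iters=20):
    """Steepest descent with the stepsize rule rho = eps * J / ||J'||^2,
    which targets a predicted decrease of eps * J per iteration."""
    v = list(v0)
    for _ in range(iters):
        g = gradJ(v)
        gg = sum(x * x for x in g)
        if gg == 0.0:
            break
        rho = eps * J(v) / gg
        v = [vi - rho * gi for vi, gi in zip(v, g)]
    return v

J = lambda v: v[0] ** 2 + v[1] ** 2
gradJ = lambda v: [2 * v[0], 2 * v[1]]
```

For this quadratic the rule gives ρ = 1/4 of the distance scale, each step halves v, and J is divided by 4 per iteration; the linear model predicts a full decrease ε J but the curvature term recovers half of it.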
60 Quasi-Newton steepest descent, ε = 1: convergence history, asymptotic and global. [Figures: J₁, J₂, J₃, J₄ and J = Σ J_i versus iteration, log scale.]
61 Quasi-Newton steepest descent, ε = 1: history of the gradients ∂J/∂v₁ and ∂J/∂v₂ over 200 iterations. [Figures: discretized functional gradients of J = Σ_i J_i w.r.t. v₁ and v₂.]
62 Quasi-Newton steepest descent, ε = 1: history of the gradients ∂J/∂v₃ and ∂J/∂v₄ over 200 iterations. [Figures: discretized functional gradients of J = Σ_i J_i w.r.t. v₃ and v₄.]
63 Quasi-Newton steepest descent, ε = 1: discrete solution. [Figures: u_h(x, y) surface and contour lines.]
64 Practical determination of the minimum-norm element: ω = Σ_{i=1}^4 α_i u_i, u_i = ∇J_i(Y⁰), using the following parameterization of the convex hull: α₁ = c₁c₂c₃, α₂ = (1−c₁)c₂c₃, α₃ = (1−c₂)c₃, α₄ = (1−c₃), where (c₁, c₂, c₃) ∈ [0,1]³ are discretized by a uniform step. The best set of coefficients is retained.
65 Stepsize. Iteration: v^(l+1) = v^(l) − ρ_l ω^(l). The stepsize is set so that δJ = J′^(l) · δv^(l) = ρ_l (J′^(l) · ω) equals ε J^(l), by fixing ρ_l = ε J^(l) / (J′^(l) · ω). In practice: ε = 1.
66 MGDA, ε = 1: asymptotic convergence. [Figures: unscaled, u_i = ∇J_i, versus scaled, u_i = ∇J_i / J_i.]
67 MGDA, ε = 1: global convergence. [Figures: unscaled, u_i = ∇J_i, versus scaled, u_i = ∇J_i / J_i.]
68 First conclusions. Scaling is essential; convergence is somewhat deceiving. Who is to blame? — the insufficiently accurate determination of ω; the non-optimality of the scaling of the gradients; the non-optimality of the step-size, the parameter ε being maintained equal to 1 throughout; the large dimension of the design space, here 76 (4 interfaces, each associated with 19 d.o.f.'s).
69 MGDA II: a direct algorithm, valid when the gradients are linearly independent. User-supplied scaling factors {S_i} (i=1,…,n), and scaled gradients J_i′ = ∇J_i(Y⁰)/S_i (S_i > 0; e.g. S_i = J_i for logarithmic gradients). Perform Gram-Schmidt orthogonalization with a special normalization: set u₁ = J₁′; for i = 2,…,n, set u_i = (J_i′ − Σ_{k<i} c_{i,k} u_k) / A_i, where for k < i: c_{i,k} = (J_i′, u_k)/(u_k, u_k), and A_i = 1 − Σ_{k<i} c_{i,k} if nonzero, ε_i otherwise (ε_i arbitrary, but small).
70 Consequences: the element ω always lies in the interior of the convex hull: ω = Σ_{i=1}^n α_i u_i, with α_i = (1/‖u_i‖²) / (Σ_{j=1}^n 1/‖u_j‖²) = 1 / (‖u_i‖² Σ_{j=1}^n ‖u_j‖⁻²). This implies equal projections of ω onto the u_k: ∀k: (u_k, ω) = α_k ‖u_k‖² = ‖ω‖², and finally (J_i′, ω) = ‖ω‖² (∀i) (with a possibly-modified scale S_i′ = (1 + ε_i) S_i), that is, the same positive constant: EQUAL FRÉCHET DERIVATIVES FORMED WITH SCALED GRADIENTS.
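The direct computation of slides 69-70 can be sketched compactly (Python; assumes linearly independent scaled gradients, with the ε fallback for a vanishing A_i):

```python
def mgda2_omega(grads, eps=1e-3):
    """Direct MGDA II: Gram-Schmidt with the special normalization
    u_i = (J_i' - sum_{k<i} c_ik u_k) / A_i, A_i = 1 - sum_{k<i} c_ik
    (or eps if that vanishes), then omega = sum_i alpha_i u_i with
    alpha_i proportional to 1/||u_i||^2."""
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    us = [list(grads[0])]
    for gi in grads[1:]:
        cs = [dot(gi, u) / dot(u, u) for u in us]
        A = 1.0 - sum(cs)
        if A == 0.0:
            A = eps
        ui = [(gik - sum(c * u[k] for c, u in zip(cs, us))) / A
              for k, gik in enumerate(gi)]
        us.append(ui)
    inv = [1.0 / dot(u, u) for u in us]
    alphas = [x / sum(inv) for x in inv]
    return [sum(a * u[k] for a, u in zip(alphas, us)) for k in range(len(us[0]))]
```

With already-orthogonal scaled gradients u₁ = (1, 0), u₂ = (0, 2): α = (0.8, 0.2), ω = (0.8, 0.4), and both Fréchet derivatives (u_i, ω) equal ‖ω‖² = 0.8, as Proposition 2 of slide 70 predicts.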
71 Geometrical demonstration. Given J_i′ = ∇J_i(Y⁰)/S_i, perform the G.-S. process to compute u_i = [J_i′ − Σ_{k<i} c_{i,k} u_k]/A_i; then O′ ∈ U̇ ⇒ ω = OO′ ⇒ (u₁, ω) = (u₂, ω) = (u₃, ω) = ‖ω‖². Since J_i′ = A_i u_i + Σ_{k<i} c_{i,k} u_k: (J_i′, ω) = (A_i + Σ_{k<i} c_{i,k}) ‖ω‖² = ‖ω‖² = const., the factor A_i + Σ_{k<i} c_{i,k} being equal to 1 by the normalization through A_i. [Figure: u₁, u₂, u₃, O, O′.]
72 MGDA II b: automatic rescale. Do not let the constant A_i become negative; instead, define A_i = 1 − Σ_{k<i} c_{i,k} only if this number is strictly positive. Otherwise, use the modified scale S_i′ = (Σ_{k<i} c_{i,k}) S_i ("automatic rescale"), so that c_{i,k}′ = c_{i,k} / (Σ_{k<i} c_{i,k}), whence Σ_{k<i} c_{i,k}′ = 1, and set A_i = ε_i, for some small ε_i. Same formal conclusion: the Fréchet derivatives are equal; but the value is much larger, and (at least) one criterion has been rescaled.
73 Some open questions. Is ω_II = ω? Yes, if n = 2 and the angle is obtuse; no, in general. Is the implication lim ω_II = 0 ⇒ lim ω = 0 true (assuming the limit is the result of the iteration)? In which order should the Gram-Schmidt process be performed? Etc.
74 MGDA II, ε = 1: global convergence. [Figures: unscaled, u_i = ∇J_i, versus scaled, u_i = ∇J_i / J_i.]
75 MGDA II b, ε = 1: global convergence. [Figures: unscaled, u_i = ∇J_i, automatic rescale on, versus scaled, u_i = ∇J_i / J_i, automatic rescale on.]
76 MGDA II: recap. Convergence over 500 iterations. [Figures: unscaled, u_i = ∇J_i, automatic rescale off, versus scaled, u_i = ∇J_i / J_i, automatic rescale on.]
77 Outline
78 MGDA III (Strelitzia reginae). [Illustration.]
79 MGDA III (Strelitzia reginae). [Illustration.]
80 MGDA III (Strelitzia reginae). [Illustration: vectors u₁, ω, u₂.]
81 MGDA III (Strelitzia reginae). [Illustration: vector ω.]
82 MGDA III — definition 1. Start from the scaled gradients {J_i′} and duplicates {g_i}: J_i′ = ∇J_i(Y⁰)/S_i (S_i: user-supplied scale), and proceed in 3 steps: A, B and C. A: Initialization. Set (justified afterwards) u₁ := g₁ = J_k′, where k = Argmax_i min_j (J_j′, J_i′)/(J_i′, J_i′). Set the n × n lower-triangular array c = {c_{i,j}} (i ≥ j) to 0. Set, conservatively, I := n (expected number of computed orthogonal basis vectors). Assign some appropriate value to a cut-off constant a, 0 ≤ a < 1. (Note: the main diagonal of array c is to contain cumulative row-sums.)
83 MGDA III — definition 2. B: Main G.-S. loop; for i = 2, 3, …, (at most) n, do: 1. Calculate the (i−1)-st column of coefficients, c_{j,i−1} = (g_j, u_{i−1})/(u_{i−1}, u_{i−1}) (∀j = i,…,n), and update the cumulative row-sums: c_{j,j} := c_{j,j} + c_{j,i−1} = Σ_{k<i} c_{j,k} (∀j = i,…,n). 2. Test: if the condition c_{j,j} > a (∀j = i,…,n) is satisfied, set I := i−1 and interrupt the Gram-Schmidt process (go to 3.). Otherwise, compute the next orthogonal vector u_i as follows:
84 MGDA III — definition 3. - Identify the index l = Argmin_j {c_{j,j} : i ≤ j ≤ n} (note that c_{l,l} ≤ a < 1). - Permute the information associated with i and l: g-vectors g_i ↔ g_l, rows i and l of array c, and cumulative row-sums c_{i,i} ↔ c_{l,l}. - Set A_i = 1 − c_{i,i} ≥ 1 − a > 0 (c_{i,i} = former c_{l,l} ≤ a), and calculate u_i = (g_i − Σ_{k<i} c_{i,k} u_k)/A_i (in which g_i = former g_l, c_{i,k} = former c_{l,k}). - If u_i ≠ 0, return to 1. with incremented i; otherwise g_i = Σ_{k<i} c_{i,k} u_k = Σ_{k<i} c_{i,k}′ g_k, where the {c_{i,k}′} are calculated by backward substitution. Then, if c_{i,k}′ ≥ 0 (∀k < i): Pareto stationarity detected — STOP the iteration; otherwise (ambiguous exceptional case): STOP the G.-S. process, compute the original ω and go to C.
85 MGDA III — definition 4. 3. Calculate ω as the minimum-norm element in the convex hull of {u₁, u₂, …, u_I}: ω = Σ_{i=1}^I α_i u_i, α_i = (1/‖u_i‖²) / (Σ_{j=1}^I 1/‖u_j‖²). (Note that here all computed u_i ≠ 0, and 0 < α_i < 1.) C: If ‖ω‖ < TOL, STOP the iteration; otherwise, perform a descent step and return to B.
86 MGDA III — consequences. If I = n: MGDA III = MGDA II b + automatic ordering in the Gram-Schmidt process, making the rescale unnecessary since, by construction, ∀i: A_i ≥ 1 − a > 0. Otherwise, I < n (incomplete Gram-Schmidt process). First I Fréchet derivatives: (g_i, ω) = (u_i, ω) = ‖ω‖² > 0 (∀i = 1,…,I). Subsequent ones (i > I): with ω = Σ_{j=1}^I α_j u_j and g_i = Σ_{k=1}^I c_{i,k} u_k + v_i, v_i ⊥ {u₁, u₂, …, u_I}: (g_i, ω) = Σ_{k=1}^I c_{i,k} (u_k, ω) = Σ_{k=1}^I c_{i,k} ‖ω‖² = c_{i,i} ‖ω‖² > a ‖ω‖² > 0.
87 Choice of g₁ — justification. At initialization, we have set u₁ := g₁ = J_k′, where k = Argmax_i min_j (J_j′, J_i′)/(J_i′, J_i′). This is equivalent to maximizing c_{2,1} = c_{2,2}, that is, maximizing the least cumulative row-sum, at first estimation. (Recall that the favorable situation is when all cumulative row-sums are positive, or > a.)
88 Expected benefits. Automatic ordering and rescale. When the gradients exhibit a general trend, ω is found in fewer steps and has a larger norm. A larger ‖ω‖ implies larger directional derivatives, (g_i, ω) = ‖ω‖² or > a ‖ω‖², and hence greater efficiency of the descent step.
89 Outline
90 Addressing the question of scaling when Hessians are known. For the objective function J_i(Y) alone, Newton's iteration writes Y₁ = Y⁰ − p_i, with H_i p_i = ∇J_i(Y⁰). Split p_i into orthogonal components p_i = q_i + r_i, where q_i = [(p_i, ∇J_i(Y⁰)) / ‖∇J_i(Y⁰)‖²] ∇J_i(Y⁰) and r_i ⊥ ∇J_i(Y⁰). Define the scaled gradient J_i′ = q_i and apply MGDA III. Equivalently, set the scaling factor as follows: S_i = ‖∇J_i(Y⁰)‖² / (p_i, ∇J_i(Y⁰)).
91 Variant MGDA III b: use BFGS iterative estimates in place of exact Hessians (k: iteration index; i = 1,…,n): H_i^(k+1) = H_i^(k) − (H_i^(k) s^(k) s^(k)T H_i^(k)) / (s^(k)T H_i^(k) s^(k)) + (z_i^(k) z_i^(k)T) / (z_i^(k)T s^(k)), in which H_i^(0) = Id, s^(k) = Y^(k+1) − Y^(k), z_i^(k) = ∇J_i(Y^(k+1)) − ∇J_i(Y^(k)).
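One step of this standard BFGS Hessian update can be sketched as follows (plain list-of-lists matrices for self-containment):

```python
def bfgs_update(H, s, z):
    """One BFGS update of the Hessian estimate H (symmetric, list of rows):
    H+ = H - (H s s^T H)/(s^T H s) + (z z^T)/(z^T s)."""
    n = len(H)
    Hs = [sum(H[i][j] * s[j] for j in range(n)) for i in range(n)]
    sHs = sum(s[i] * Hs[i] for i in range(n))
    zs = sum(z[i] * s[i] for i in range(n))
    return [[H[i][j] - Hs[i] * Hs[j] / sHs + z[i] * z[j] / zs
             for j in range(n)] for i in range(n)]
```

For the quadratic J(Y) = Y₁² + 2Y₂² (Hessian diag(2, 4)), a step s = (1, 0) gives z = ∇J(Y+s) − ∇J(Y) = (2, 0); starting from H⁰ = Id, the update reproduces the first Hessian column exactly and satisfies the secant condition H⁺s = z.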
92 Recommended step-size. J_i(Y⁰ − ρω) ≈ J_i(Y⁰) − ρ (∇J_i(Y⁰), ω) + ½ ρ² (H_i ω, ω), so δJ_i := J_i(Y⁰) − J_i(Y⁰ − ρω) ≈ A_i ρ − ½ B_i ρ², where A_i = S_i a_i, B_i = S_i b_i, and a_i = (J_i′, ω), b_i = (H_i ω, ω)/S_i. If the scales {S_i} are truly relevant for the objective functions, choose ρ* = Argmax_ρ min_i δ_i, where δ_i = δJ_i / S_i = a_i ρ − ½ b_i ρ².
93 Recommended step-size 2. If I = n: ρ* = ρ_I := ‖ω‖² / b_I, where b_I = max_{i≤I} b_i is assumed to be positive. If I < n: ρ_II := a ‖ω‖² / b_II, where b_II = max_{i>I} b_i is assumed to be positive; also ρ*** = 2(1−a) ‖ω‖² / (b_I − b_II), at which point the two bounding parabolas intersect: ‖ω‖² ρ − ½ b_I ρ² = a ‖ω‖² ρ − ½ b_II ρ².
94 Recommended step-size 3. [Figure: δ versus ρ, four cases: a) I = n; b) I < n, ρ*** > max(ρ_I, ρ_II); c) I < n, ρ*** ∈ (ρ_I, ρ_II); d) I < n, ρ*** < min(ρ_I, ρ_II); bounding parabolas at levels ‖ω‖² and a ‖ω‖².]
95 Recommended step-size — end. If I = n, ρ = ρ_I. If I < n, ρ is the element of the triplet {ρ_I, ρ_II, ρ***} which separates the other two.
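The selection rule of slides 93-95 reduces to a few lines (a sketch; the positivity of b_I, b_II and of b_I − b_II is assumed, as on the slides):

```python
def recommended_step(omega_norm2, a, b_I, b_II=None):
    """Step-size selection: rho_I = ||omega||^2 / b_I when the Gram-Schmidt
    process is complete (I = n); otherwise the median of {rho_I, rho_II,
    rho_star}, where rho_star is the parabola-intersection point."""
    rho_I = omega_norm2 / b_I
    if b_II is None:               # complete process: I = n
        return rho_I
    rho_II = a * omega_norm2 / b_II
    rho_star = 2.0 * (1.0 - a) * omega_norm2 / (b_I - b_II)
    return sorted([rho_I, rho_II, rho_star])[1]   # the separating element
```

For instance, with ‖ω‖² = 1, a = 0.5, b_I = 4, b_II = 1: ρ_I = 0.25, ρ_II = 0.5, ρ_star = 1/3, and the rule returns 1/3.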
96 Outline
97 Multiple-Gradient Descent Algorithm — conclusions: theory. The notion of Pareto stationarity was introduced to characterize "standard" Pareto-optimal points. MGDA permits the identification of a descent direction common to an arbitrary number of criteria, knowing the local gradients, in number at most equal to the number of design parameters. Proof of convergence of MGDA to Pareto-stationary points. The capability to identify the Pareto front was demonstrated on mathematical test cases; non-convexity of the Pareto front is not a problem. MGDA II: a direct procedure, faster and more accurate, when the gradients are linearly independent; variant II b (with automatic rescale) recommended. On-going research: scaling, preconditioning, step-size adjustment, extension to cases of linearly-dependent gradients, …
98 CONCLUSIONS: Applications

- Implementation and first experiments
- Validation on analytical test cases and comparison with a Pareto-capturing method (PAES)
- Extension to meta-model-assisted MGDA and an automobile cooling-system application
- Successful design experiments with 3D compressible flow (Euler and Navier-Stokes)
- On-going development of meta-model-assisted MGDA and testing on a 3D external-aerodynamics case (to be presented at ECCOMAS 2012, Vienna, Austria)

98 / 102
99 CONCLUSIONS: Applications (domain-partitioning model problem)

The method works, but it is not as cost-efficient as the standard quasi-Newton method applied to the agglomerated criterion; the test-case, however, was peculiar in several ways:
- Pareto front and Pareto set reduced to a singleton
- large dimension of the design space (4 criteria, 76 design variables)
- a robust procedure to adjust the step-size is needed
- the determination of ω in the basic method should be made more accurate, perhaps using iterative refinement
- scaling of the gradients was found important but never completely analyzed (logarithmic gradients appropriate for vanishing criteria)
- general preconditioning not yet clear (on-going)

The direct variant was found faster and more robust; with the automatic rescaling procedure on, variant b seems to exhibit quadratic convergence.

99 / 102
100 Outline

100 / 102
101 Flirting with the Pareto front: Cooperation AND Competition

MGDA is combined with a locally-adapted Nash game based on a territory splitting that preserves, to second order, the Pareto-stationarity condition.

COOPERATION (MGDA): the design point is steered to the Pareto front.

COMPETITION (Nash game): at the start (from the Pareto front): α_1 ∇J_1 + α_2 ∇J_2 = 0; then let J_A = α_1 J_1 + α_2 J_2 and J_B = J_2, and adapt the territory splitting to best preserve J_A; then the trajectory remains tangent to the Pareto front.

101 / 102
102 Some references

Reports from the INRIA Open Archive:
- M. G. D. A., INRIA Research Report No. (2009), revised version, October 2012 (Open Archive: ...)
- Related INRIA Research Reports: Nos. (2011), 7922 (2012), 7968 (2012), 8068 (2012)

Other:
- C. R. Acad. Sci. Paris, Ser. I 350 (2012)

102 / 102