MGDA II: A direct method for calculating a descent direction common to several criteria

Similar documents
MGDA Variants for Multi-Objective Optimization

Optimization of an Unsteady System Governed by PDEs using a Multi-Objective Descent Method

Numerical modification of atmospheric models to include the feedback of oceanic currents on air-sea fluxes in ocean-atmosphere coupled models

Best linear unbiased prediction when error vector is correlated with other random vectors in the model

Easter bracelets for years

A proximal approach to the inversion of ill-conditioned matrices

The Core of a coalitional exchange economy

The 0-1 Outcomes Feature Selection Problem : a Chi-2 Approach

Case report on the article Water nanoelectrolysis: A simple model, Journal of Applied Physics (2017) 122,

Methylation-associated PHOX2B gene silencing is a rare event in human neuroblastoma.

Bound on Run of Zeros and Ones for Images of Floating-Point Numbers by Algebraic Functions

Analysis of Boyer and Moore s MJRTY algorithm

A new simple recursive algorithm for finding prime numbers using Rosser s theorem

Delaunay triangulation of a random sample of a good sample has linear size

Smart Bolometer: Toward Monolithic Bolometer with Smart Functions

On Symmetric Norm Inequalities And Hermitian Block-Matrices

Spatial representativeness of an air quality monitoring station. Application to NO2 in urban areas

Identifying codes for infinite triangular grids with a finite number of rows

Widely Linear Estimation with Complex Data

Thomas Lugand. To cite this version: HAL Id: tel

On infinite permutations

Vibro-acoustic simulation of a car window

Dispersion relation results for VCS at JLab

About partial probabilistic information

On Symmetric Norm Inequalities And Hermitian Block-Matrices

New estimates for the div-curl-grad operators and elliptic problems with L1-data in the half-space

Completeness of the Tree System for Propositional Classical Logic

Some explanations about the IWLS algorithm to fit generalized linear models

ON THE UNIQUENESS IN THE 3D NAVIER-STOKES EQUATIONS

On size, radius and minimum degree

A Context free language associated with interval maps

Analyzing large-scale spike trains data with spatio-temporal constraints

Methods for territorial intelligence.

Bounded Normal Approximation in Highly Reliable Markovian Systems

Replicator Dynamics and Correlated Equilibrium

Quasi-periodic solutions of the 2D Euler equation

Can we reduce health inequalities? An analysis of the English strategy ( )

Reduced Models (and control) of in-situ decontamination of large water resources

Finite Volume for Fusion Simulations

On Poincare-Wirtinger inequalities in spaces of functions of bounded variation

A generalization of Cramér large deviations for martingales

Impedance Transmission Conditions for the Electric Potential across a Highly Conductive Casing

Notes on Birkhoff-von Neumann decomposition of doubly stochastic matrices

Passerelle entre les arts : la sculpture sonore

Some consequences of the analytical theory of the ferromagnetic hysteresis

Evolution of the cooperation and consequences of a decrease in plant diversity on the root symbiont diversity

Application of MGDA to domain partitioning

Stickelberger s congruences for absolute norms of relative discriminants

A simple kinetic equation of swarm formation: blow up and global existence

Exact Comparison of Quadratic Irrationals

Cutwidth and degeneracy of graphs

Towards an active anechoic room

L institution sportive : rêve et illusion

Basic concepts and models in continuum damage mechanics

On the uniform Poincaré inequality

The Windy Postman Problem on Series-Parallel Graphs

On Newton-Raphson iteration for multiplicative inverses modulo prime powers

Unbiased minimum variance estimation for systems with unknown exogenous inputs

On the longest path in a recursively partitionable graph

MULTIPLE GRADIENT DESCENT ALGORITHM (MGDA) FOR MULTIOBJECTIVE OPTIMIZATION

Finite volume method for nonlinear transmission problems

From Unstructured 3D Point Clouds to Structured Knowledge - A Semantics Approach

Transfer matrix in one dimensional problems

Characterization of Equilibrium Paths in a Two-Sector Economy with CES Production Functions and Sector-Specific Externality

Full-order observers for linear systems with unknown inputs

Axiom of infinity and construction of N

3D-CE5.h: Merge candidate list for disparity compensated prediction

Fast Computation of Moore-Penrose Inverse Matrices

Soundness of the System of Semantic Trees for Classical Logic based on Fitting and Smullyan

Capillary rise between closely spaced plates : effect of Van der Waals forces

The core of voting games: a partition approach

b-chromatic number of cacti

A remark on a theorem of A. E. Ingham.

Thermodynamic form of the equation of motion for perfect fluids of grade n

Nel s category theory based differential and integral Calculus, or Did Newton know category theory?

Paths with two blocks in n-chromatic digraphs

Sound intensity as a function of sound insulation partition

On the Griesmer bound for nonlinear codes

Norm Inequalities of Positive Semi-Definite Matrices

Accurate critical exponents from the ϵ-expansion

Dorian Mazauric. To cite this version: HAL Id: tel

A NON - CONVENTIONAL TYPE OF PERMANENT MAGNET BEARING

The FLRW cosmological model revisited: relation of the local time with th e local curvature and consequences on the Heisenberg uncertainty principle

Kernel Generalized Canonical Correlation Analysis

Beat phenomenon at the arrival of a guided mode in a semi-infinite acoustic duct

Proposition of a general yield function in geomechanics

A Slice Based 3-D Schur-Cohn Stability Criterion

A Study of the Regular Pentagon with a Classic Geometric Approach

Water Vapour Effects in Mass Measurement

Solving an integrated Job-Shop problem with human resource constraints

Gaia astrometric accuracy in the past

The magnetic field diffusion equation including dynamic, hysteresis: A linear formulation of the problem

A non-commutative algorithm for multiplying (7 7) matrices using 250 multiplications

Non Linear Observation Equation For Motion Estimation

Differential approximation results for the Steiner tree problem

Finding good 2-partitions of digraphs II. Enumerable properties

Parallel Repetition of entangled games on the uniform distribution

The beam-gas method for luminosity measurement at LHCb

On constraint qualifications with generalized convexity and optimality conditions

A Simple Proof of P versus NP

Transcription:

MGDA II: A direct method for calculating a descent direction common to several criteria Jean-Antoine Désidéri To cite this version: Jean-Antoine Désidéri. MGDA II: A direct method for calculating a descent direction common to several criteria. [Research Report] RR-7922, INRIA. 2012, pp.11. <hal-00685762> HAL Id: hal-00685762 https://hal.inria.fr/hal-00685762 Submitted on 5 Apr 2012 HAL is a multi-disciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers. L archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des établissements d enseignement et de recherche français ou étrangers, des laboratoires publics ou privés.

MGDA II: A direct method for calculating a descent direction common to several criteria Jean-Antoine Désidéri RESEARCH REPORT N 7922 April 2012 Project-Team Opale ISSN 0249-6399 ISRN INRIA/RR--7922--FR+ENG

MGDA II: A direct method for calculating a descent direction common to several criteria Jean-Antoine Désidéri Project-Team Opale Research Report n 7922 April 2012 8 pages Abstract: This report is a sequel of the publications [1] [3] [2]. We consider the multiobjective optimization problem of the simultaneous minimization of n (n 2) criteria,{j i (Y)} (,...,n), assumed to be smooth real-valued functions of the design vector Y Ω R N (n N) whereωis the (open) admissible domain ofr N over which these functions admit gradients. Given a design point Y 0 Ω that is not Pareto-stationary, we introduce the gradients{j i } (,...,n) at Y=Y 0, and assume them to be linearly independent. We also consider the possible scaling factors,{s i } (,...,n) (S i > 0, i), as specified appropriate normalization constants for the gradients. Then we show that the Gram-Schmidt orthogonalization process, if conducted with a particular calibration of the normalization, yields a new set of orthogonal vectors{u i } (,..,n) spanning the same subspace as the original gradients; additionally, the minimum-norm element of the convex hull corresponding to this new family, ω, is calculated explicitly, and the Fréchet derivatives of the criteria in the direction of ω are all equal and positive. This direct process simplifies the implementation of the previously-defined Multiple-Gradient Descent Algorithm (MGDA). Key-words: multiobjective optimization, descent direction, convex hull, Gram-Schmidt orthogonalization process INRIA Research Director, Opale Project-Team Head RESEARCH CENTRE SOPHIA ANTIPOLIS MÉDITERRANÉE 2004 route des Lucioles - BP 93 06902 Sophia Antipolis Cedex

MGDA II: Une méthode directe de calcul de direction de descente de plusieurs critères Résumé : Ce rapport est une suite des publications [1] [3] [2]. On considère le problème d optimisation multiobjectif dans lequel on cherche à minimiser n (n 2) critères,{j i (Y)} (,...,n), supposés fonctions régulières d un vecteur de conception Y Ω R N (n N) oùωest le domaine (ouvert) admissible, partie der N dans laquelle les critères admettent des gradients. Étant donné un point de conception Y 0 Ω qui n est pas Pareto-stationnaire, on introduit les gradients{j i } (,...,n) en Y = Y 0, et on les suppose linéairement indépendants. On considère également un ensemble de facteurs d échelles,{s i } (,...,n) (S i > 0, i), spécifiés par l utilisateur, et considérés comme des constantes appropriées de normalisation des gradients. On montre alors que le processus d orthogonalisation de Gram-Schmidt, lorsqu on le conduit avec une calibration bien spécifique de la normalisation, produit un ensemble de vecteurs orthogonaux{u i } (,..,n) qui engendrent le même sous-espace que les gradients d origine; de plus, l élément de plus norme de l enveloppe convexe de cette nouvelle famille, ω, se calcule explicitement, et les dérivées de Fréchet des critères dans la direction de ω sont égales et positives. Ce processus direct simplifie la mise en œuvre de l Algorithme de Descente à Gradients Multiples (MGDA) défini précédemment. Mots-clés : optimisation multiobjectif, direction de descente, enveloppe convexe, processus d orthogonalisation de Gram-Schmidt

MGDA II 3 1 Introduction We consider the context of the simultaneous minimization of n (n 2) criteria,{j i (Y)} (,...,n), assumed to be smooth real-valued functions of the design vector Y Ω R N (n N) whereω is the (open) admissible domain ofr N over which these functions admit gradients. Let Y 0 Ω, and let: J i= J i (Y 0 ) (,..., n; J i R N ) (1) be the gradients at the design-point Y 0. In [1] and [2], we have introduced the local notion of Pareto-stationarity, defined as the existence of a convex combination of the gradients that is equal to zero. We established there that Pareto-stationarity was a necessary condition to Pareto-optimality. If inversely, the Paretostationarity condition is not satisfied, vectors having positive scalar products with all the gradients {J i } (,...,n) exist. We focus on the question of identifying such vectors. Consider a family{u i } (,...,n) of n vectors ofr N, and recall the definition of their convex hull: n n U= u RN / u= α i u i ;α i 0 ( i) ; α i = 1 (2) This set is closed and convex. Hence it admits a unique elementωof minimum norm. We established that: ( ) u U : u,ω ω 2 (3) By letting u i = J i (,..., n) (4) and identifying the corresponding vectorω, we were able to conclude that eitherω=0 and the design-point Y 0 is Pareto-stationary, or ω is a descent direction common to all criteria. This observation has led us to propose the Multiple-Gradient Descent Algorithm (MGDA) that is an iteration generalizing the classical steepest-descent method to the context of multiobjective optimization. At a given iteration, the design point Y 0 is updated by a step in the direction opposite toω: δy 0 = ρω (5) Assuming the stepsize ρ is optimized, MGDA converges to a Pareto-stationary design point [1] [2]. Thus, MGDA provides a technique to identify Pareto sets when gradients are available, as demonstrated in [3]. We have also shown that the convex hull was isomorphic to the positive part of the sphere ofr n 1 (independently of N) and it can be parameterized by n 1 spherical coordinates. Hence, in the first version of our method, when n>2, we proposed to identify the vectorωby actually finding numerically the minimum of u 2 in U by optimizing the spherical coordinates, or more precisely, their cosines squared that are n 1 independent parameters varying in the interval [0,1]. This minimization can be conducted trivially when n is small, but can become difficult for large n. In this new report, we propose a variant of MGDA in which the directionωis found by a direct process, assuming the family of gradients{j i } (,...,n) is linearly independent. 2 Direct calculation of the minimum-norm element In [2], the following remark was made: RR n 7922

4 Jean-Antoine Désidéri Remark 1 If the gradients are not normalized, the direction of the minimum-norm elementωis expected to be mostly influenced by the gradients of small norms in the family, as the case n=2 illustrated in Figure 1 suggests. In the course of the iterative optimization, these vectors are often associated with the criteria that have already achieved a fair degree of convergence. If this direction may yield a very direct path to the Pareto front, one may question whether it is adequate for a well-balanced multiobjective iteration. Some on-going research is focused on analyzing various normalization procedures to circumvent this undesirable trend. In these alternatives, the gradient J is replaced i by one of the following formulas: ( J i, J i J i (Y 0 ), J i (Y 0 ) max J (k 1) (Y 0 ) J (k) (Y 0 ),δ ) i i, or J 2 i (6) J i J 2 J i i (k: iteration number; δ > 0, small). The first formula is a standard normalization: it has the merit of providing a stable definition; the second realizes equal logorithmic first variations of the criteria whenever ω belongs to the interior U of the convex hull since then, the Fréchet derivatives ( ui,ω ) are equal; the last two are inspired from Newton s method (assuming lim J i = 0 for the first). This question is still open. J i u 1 ω=ω u 2 u 1 ω u 1 =ω u 2 ω=ω u 2 Figure 1: Case n=2: possible positions of vectorωwith respect to the two gradients u 1 and u 2 Thus, accounting for the above remark, we consider a given set of strictly-positive scaling factors {S i } (,...,n), intended to normalize the gradients appropriately. We apply the Gram- Schmidt orthogonalization process to the gradients with the following special calibration of the normalization: where A 1 = S 1, and, for i=2, 3,..., n: u 1 = J 1 A 1 (7) u i = J i k<i c i,k u k A i (8) where: k<i : c i,k = ( J i, u ) k ( ) (9) uk, u k and S i A i = k<i ε i S i for some arbitrary, but smallε i (0< ε i 1). c i,k if nonzero otherwise (10) Then : In general, the vectors{u i } (,...,n) are not of norm unity. But they are orthogonal, and this property Inria

MGDA II 5 makes the calculation of the minimum-norm element of the convex hull, ω, direct. For this, note that: n ω= α i u i (11) and ω 2 = n α 2 i u i 2. (12) To determine the coefficients{α i } (,...,n), anticipating thatωbelongs to the interior of the convex hull, the inequality constraints are ignored, and the following Lagrangian is made stationary: n n n L(α,λ)= ω 2 λ α i 1 = α 2 i u i 2 λ α i 1 (13) This gives: L α i = 2 u i 2 α i λ= α i = The equality constraint, n α i= 1, then gives: and finally: α i = λ 2 = 1 1 u i 2 = n 1 j=1 u j 2 λ 2 u i 2 (14) n (15) 1 u i 2 1 1+ j i which confirms thatωdoes belong to the interior of the convex hull, so that: < 1 (16) u i 2 u j 2 i : α i u i 2 = λ 2 (a constant), (17) and: k : ( uk,ω ) =α k u k 2 = λ 2. (18) Now: ( J i,ω) = ( A i u i + c i,k u k,ω)= A λ i+ c i,k 2 = S λ i 2, or S i(1+ε i ) λ 2 k<i k<i ( i). (19) Lastly, convening thatε i = 0 in the regular case (S i k<i c i,k ), and otherwise by modifying slightly the definition of the scaling factor according to the following holds: S i= (1+ε i )S i, (20) ( S 1 i J i,ω) = λ 2 ( i) (21) that is, the same positive constant. RR n 7922

6 Jean-Antoine Désidéri 3 Conclusion We have considered the multiobjective optimization problem of the simultaneous minimization of n (n 2) criteria,{j i (Y)} (,...,n), assumed to be smooth real-valued functions of the design vector Y Ω R N (n N) whereωis the (open) admissible domain ofr N over which these functions admit gradients. Given a design point Y 0 Ω that is not Pareto-stationary, we have introduced the gradients{j i } (,...,n) at Y=Y 0, and assumed them to be linearly independent. We have also considered the possible scaling factors,{s i } (,...,n) (S i > 0, i), as specified appropriate normalization constants for the gradients. Then we have shown that the Gram- Schmidt orthogonalization process, if conducted with a particular calibration of the normalization yields a new set of orthogonal vectors{u i } (,..,n) spanning the same subspace as the original gradients; additionally, the minimum-norm element of the convex hull corresponding to this new family, ω, is calculated explicitly, and the Fréchet derivatives of the criteria in the direction of ω are all equal and positive. This new result has led us to reformulate the definition of the MGDA, in which the descent direction common to all criteria is now calculated by a direct process. Finally, we make the following two remarks: Remark 2 In the exception case where k<i c i,k = S i, for some i, the corresponding vector u i has a large norm. Consequently, this vector has a weak influence on the definition ofω. Remark 3 As the Pareto set is approached, the family of gradients becomes closer to linear dependence, as the Pareto-stationarity condition requires. At this stage, this variant may experience numerical difficulties, making the more robust former approach preferable. Inria

MGDA II 7 References [1] Jean-Antoine Désidéri. Multiple-Gradient Descent Algorithm (MGDA). Research Report RR-6953, INRIA, June 2009. [2] Jean-Antoine Désidéri. Multiple-gradient descent algorithm (mgda) for multiobjective optimization. Comptes rendus - Mathématique, 1(4867), 2012. DOI: 10.1016/j.crma.2012.03.014. [3] Adrien Zerbinati, Jean-Antoine Desideri, and Régis Duvigneau. Comparison between MGDA and PAES for Multi-Objective Optimization. Research Report RR-7667, INRIA, June 2011. RR n 7922

8 Jean-Antoine Désidéri Contents 1 Introduction 3 2 Direct calculation of the minimum-norm element 3 3 Conclusion 6 Inria

RESEARCH CENTRE SOPHIA ANTIPOLIS MÉDITERRANÉE 2004 route des Lucioles - BP 93 06902 Sophia Antipolis Cedex Publisher Inria Domaine de Voluceau - Rocquencourt BP 105-78153 Le Chesnay Cedex inria.fr ISSN 0249-6399