On construction of constrained optimum designs

On construction of constrained optimum designs. Institute of Control and Computation Engineering, University of Zielona Góra, Poland. DEMA2008, Cambridge, 15 August 2008.

Numerical algorithms to construct optimal designs

1. Sequential algorithms with selection of support points: the Wynn-Fedorov scheme (Atkinson, Donev and Tobias, 2007; Fedorov and Hackl, 1997; Walter and Pronzato, 1997; Pázman, 1986; Silvey, 1980).
2. Sequential numerical design algorithms with support points given a priori: the multiplicative scheme (Torsney, 1988; Silvey, Titterington and Torsney, 1978; Torsney and Mandal, 2001, 2004; Pázman, 1986) and linear matrix inequalities (Boyd and Vandenberghe, 2004).

In practice, various inequality constraints must sometimes be considered due to cost limitations, restrictions required to achieve certain robustness properties, or restrictions on the experimental space. Although much theoretical work has been done (Fedorov and Hackl, 1997; Cook and Fedorov, 1995), publications on the algorithmic aspects of constrained design optimization are still scarce.

Classical framework

Multiresponse parametric model:
y_{ij} = η(x_i, θ) + ε_{ij},  j = 1, ..., r_i,  i = 1, ..., n.

Notation:
y_{ij} — observations of the response variables,
x_i — fixed values of the explanatory (independent) variables (e.g., time, temperature, spatial location, drug doses),
r_i ≥ 1 — number of replications at the setting x_i, with N = Σ_{i=1}^n r_i,
η(·, ·) — known regression function,
θ — vector of constant but unknown parameters.

Classical framework: additive random errors

E(ε_{ij}) = 0,  E(ε_{ij} ε_{kl}^T) = δ_{ik} δ_{jl} V(x_i).

Notation:
V(x_i) ≻ 0 — dispersion matrices, known (possibly up to a common constant multiplier),
δ — the Kronecker delta.

Simplification for linear models

Linear regression: η(x_i, θ) = F(x_i)^T θ, where the F(x_i) are known matrices.

BLUE of θ:
θ̂ = M^{-1} Σ_{i=1}^n r_i F(x_i) V(x_i)^{-1} ȳ_i.

Notation:
ȳ_i = (1/r_i) Σ_{j=1}^{r_i} y_{ij},
M_i = F(x_i) V(x_i)^{-1} F(x_i)^T,  i = 1, ..., n,
M = Σ_{i=1}^n r_i M_i — the Fisher information matrix (FIM).

Linear models (ctd)

Covariance matrix of θ̂: cov(θ̂) = M^{-1}.

We assume that the values x_i, i = 1, ..., n, are fixed and may not be altered, but we have full control over the corresponding numbers of replications r_i, i = 1, ..., n. We wish to choose them in an optimal way so as to enhance the process of estimating θ.

Convenient formulation

Discrete design:
ξ = { x_1, ..., x_n ; p_1, ..., p_n }.

Notation:
x_i — support points,
p_i = r_i / N — weights.

P.m.f. property of the weights: 1^T p = 1, p ≥ 0, where 1 = (1, 1, ..., 1).

Optimality criterion

Normalized FIM:
M̃(p) = (1/N) M = Σ_{i=1}^n p_i M_i.

D-optimality criterion:
Φ[M̃(p)] = log det(M̃(p)) → max.

Further, for simplicity of notation, the tilde over M̃(·) will be dropped.

Problems involved

Problem 1. The resulting optimization problem constitutes a classical discrete resource allocation problem. Its combinatorial nature excludes calculus techniques and implies prohibitive computational complexity.

Way round: relaxation. The feasible weights p_i are treated as arbitrary real numbers in the interval [0, 1] that sum to unity, not necessarily integer multiples of 1/N.

Advantage: a simple and efficient multiplicative algorithm can be exploited (cf. the previous talk by Ben Torsney).

Problems involved (ctd)

Problem 2. The resulting designs concentrate on a relatively small number of support points (close to the number of estimated parameters), rather than spreading the measurement effort around appropriately, which is what many practising statisticians tend to do.

Solution: prevent spending the overall experimental effort at a few points by directly bounding the frequencies of observations from above: p ≤ b, where the vector b ≤ 1 (componentwise) is fixed.

Problem statement once again

Ultimate formulation: given a vector b ≥ 0 satisfying 1^T b ≥ 1, find a vector of weights p = (p_1, ..., p_n) to maximize
Φ[M(p)] = log det(M(p))
subject to
0 ≤ p ≤ b,  1^T p = 1.

Properties

1. The performance index Φ is concave over the canonical simplex S_n = { p ≥ 0 : 1^T p = 1 }.
2. It is differentiable at points yielding nonsingular FIMs, with gradient
φ(p) := ∇Φ(p) = [ tr{ M(p)^{-1} M_1 }, ..., tr{ M(p)^{-1} M_n } ]^T.
3. The constraint set P is a rather nice convex set (e.g., fast algorithms for orthogonal projection onto P exist).

Numerous computational methods can potentially be employed, e.g., the conditional gradient method or a gradient projection method, but if the number of support points is large, they may lead to unsatisfactorily long computation times. (A sketch of evaluating Φ and φ is given below.)

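As a minimal sketch (not part of the original talk), the criterion and its gradient can be evaluated as follows, assuming the elementary information matrices M_i are stacked in a NumPy array M_list of shape (n, m, m); the function names are illustrative.

```python
import numpy as np

def fim(p, M_list):
    """Normalized FIM M(p) = sum_i p_i M_i for weights p and stacked matrices M_i."""
    return np.einsum('i,ijk->jk', p, M_list)

def criterion(p, M_list):
    """D-optimality criterion Phi(p) = log det M(p); -inf if M(p) is singular."""
    sign, logdet = np.linalg.slogdet(fim(p, M_list))
    return logdet if sign > 0 else -np.inf

def gradient(p, M_list):
    """Gradient phi_i(p) = tr{ M(p)^{-1} M_i }, valid when M(p) is nonsingular."""
    Minv = np.linalg.inv(fim(p, M_list))
    return np.einsum('jk,ikj->i', Minv, M_list)
```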

Characterization of the optimal design

Proposition 1. Suppose that the matrix M(p*) is nonsingular for some p* ∈ P. The vector p* constitutes a global maximum of Φ over P if, and only if, there exists a number λ such that, for i = 1, ..., n,
φ_i(p*) ≥ λ if p*_i = b_i,
φ_i(p*) = λ if 0 < p*_i < b_i,
φ_i(p*) ≤ λ if p*_i = 0.

Simplicial decomposition

Simplicial decomposition (SD) stands for a class of methods for solving large-scale continuous problems in mathematical programming with convex feasible sets (von Hohenbalken, 1977). It iterates by alternately solving
1. a linear programming subproblem (the so-called column generation problem), which generates an extreme point of the polyhedron, and
2. a nonlinear restricted master problem (RMP), which finds the maximum of the objective function over the convex hull (a simplex) of previously generated extreme points.
Its principal characteristic is that the sequence of successive solutions to the master problem tends to a solution of the original problem in such a way that the objective function strictly monotonically approaches its optimal value.

[Figures: a sequence of animation frames giving a geometric illustration of the simplicial decomposition iterations in the (p1, p2) plane.]

Algorithm SD

Step 0 (Initialization): Guess an initial solution p^(0) ∈ P such that M(p^(0)) is nonsingular. Set I = { 1, ..., n }, Q^(0) = { p^(0) } and k = 0.

Step 1 (Termination check): Set
I_ub^(k) = { i ∈ I : p_i^(k) = b_i },
I_im^(k) = { i ∈ I : 0 < p_i^(k) < b_i },
I_lb^(k) = { i ∈ I : p_i^(k) = 0 }.
If, for some λ ∈ R_+,
φ_i(p^(k)) ≥ λ for i ∈ I_ub^(k),
φ_i(p^(k)) = λ for i ∈ I_im^(k),
φ_i(p^(k)) ≤ λ for i ∈ I_lb^(k),
then STOP: p^(k) is optimal. (A sketch of this test in code is given below.)

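The termination test of Step 1 can be coded directly from Proposition 1. The sketch below is illustrative (the helper name and the tolerance tol are not from the talk); it classifies the indices and checks whether a single multiplier λ is consistent with all three conditions.

```python
import numpy as np

def is_optimal(p, b, phi, tol=1e-6):
    """Step 1 / Proposition 1: check the optimality conditions for the design p.

    phi is the gradient phi(p), b the vector of upper bounds, tol a numerical tolerance.
    """
    ub = p >= b - tol                             # indices with p_i = b_i
    lb = p <= tol                                 # indices with p_i = 0
    im = ~(ub | lb)                               # interior indices, 0 < p_i < b_i
    hi = phi[ub].min() if ub.any() else np.inf    # lambda may not exceed this
    lo = phi[lb].max() if lb.any() else -np.inf   # lambda must be at least this
    if im.any():
        lam = phi[im].mean()
        if np.abs(phi[im] - lam).max() > tol:     # interior gradients must coincide
            return False
        return lo - tol <= lam <= hi + tol
    return lo <= hi + tol                         # otherwise any lambda in [lo, hi] works
```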

Step 2 (Solution of the column generation subproblem): Compute
q^(k+1) = arg max_{p ∈ P} φ(p^(k))^T p
and set Q^(k+1) = Q^(k) ∪ { q^(k+1) }.

Step 3 (Solution of the restricted master subproblem): Find
p^(k+1) = arg max_{p ∈ co(Q^(k+1))} Φ[M(p)]
and purge Q^(k+1) of all extreme points with zero weights in the resulting expression of p^(k+1) as a convex combination of elements of Q^(k+1). Increment k by one and go back to Step 1. (A sketch of the whole loop is given below.)

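Putting Steps 0-3 together, the outer loop of Algorithm SD might look as follows. This is only a sketch: column_generation and solve_rmp stand for the subproblem solvers sketched further below, criterion/gradient/is_optimal for the helpers above, and all names are illustrative rather than the authors' code.

```python
import numpy as np

def simplicial_decomposition(M_list, b, p0, max_iter=100, tol=1e-6):
    """Algorithm SD: maximize log det M(p) over 0 <= p <= b, 1'p = 1 (sketch)."""
    p = np.asarray(p0, dtype=float).copy()
    Q = [p.copy()]                                   # Step 0: Q^(0) = { p^(0) }
    for _ in range(max_iter):
        phi = gradient(p, M_list)
        if is_optimal(p, b, phi, tol):               # Step 1: termination check
            break
        q = column_generation(phi, b)                # Step 2: new extreme point
        Q.append(q)
        w = solve_rmp(Q, M_list)                     # Step 3: restricted master problem
        p = sum(wj * qj for wj, qj in zip(w, Q))
        Q = [qj for wj, qj in zip(w, Q) if wj > tol] # purge zero-weight points
    return p
```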

Column generation problem

Basically, it is a linear programming problem: maximize c^T p subject to p ∈ P, where c = φ(p^(k)).

A vector q ∈ P constitutes its global solution if, and only if, there exists a scalar ρ such that, for i = 1, ..., n,
c_i ≥ ρ if q_i = b_i,
c_i = ρ if 0 < q_i < b_i,
c_i ≤ ρ if q_i = 0.

Solution of the column generation problem

Step 0 (Initialization): Set j = 0 and v^(0) = 0.

Step 1 (Sorting): Sort the elements of c in nonincreasing order, i.e., find a permutation π on the index set I = { 1, ..., n } such that c_{π(i)} ≥ c_{π(i+1)}, i = 1, ..., n − 1.

Step 2 (Identification of nonzero weights):
Step 2.1: If v^(j) + b_{π(j+1)} < 1, set v^(j+1) = v^(j) + b_{π(j+1)}; otherwise, go to Step 3.
Step 2.2: Increment j by one and go to Step 2.1.

Step 3 (Form the ultimate solution): Set
q_{π(i)} = b_{π(i)} for i = 1, ..., j,
q_{π(i)} = 1 − v^(j) for i = j + 1,
q_{π(i)} = 0 for i = j + 2, ..., n.

The algorithm starts by picking the consecutive largest components c_i of c and setting the corresponding weights q_i to their maximal allowable values b_i. The process is repeated until the sum of the assigned weights would exceed one; the value of the last weight set in this manner is then corrected so that the weights sum to one. The remaining (i.e., unassigned) weights are set to zero. (A sketch in code is given below.)

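A compact sketch of this greedy procedure (illustrative name; it assumes 1^T b ≥ 1 so that a feasible q exists):

```python
import numpy as np

def column_generation(c, b):
    """Maximize c'p over 0 <= p <= b, 1'p = 1 by filling the largest c_i first."""
    order = np.argsort(-c)                 # indices sorted by nonincreasing c_i
    q = np.zeros_like(np.asarray(b, dtype=float))
    remaining = 1.0
    for i in order:
        q[i] = min(b[i], remaining)        # assign the maximal allowable weight
        remaining -= q[i]
        if remaining <= 0.0:               # budget exhausted: remaining weights stay zero
            break
    return q
```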

Solution of the restricted master problem

Suppose that in the (k+1)-th iteration of SD we have Q^(k+1) = { q_1, ..., q_r }, possibly with r < k + 1 (owing to the deletion mechanism for uninformative points). Step 3 of Algorithm SD involves the maximization of Φ[M(p)] = log det(M(p)) over
co(Q^(k+1)) = { p = Σ_{j=1}^r w_j q_j : w ≥ 0, 1^T w = 1 }.

Solution of the restricted master problem (ctd)

From the representation of any p ∈ co(Q^(k+1)) as p = Σ_{j=1}^r w_j q_j, or, in component-wise form, p_i = Σ_{j=1}^r w_j q_{j,i}, i = 1, ..., n, with q_{j,i} the i-th component of q_j, it follows that
M(p) = Σ_{i=1}^n p_i M_i = Σ_{j=1}^r w_j ( Σ_{i=1}^n q_{j,i} M_i ) = Σ_{j=1}^r w_j M(q_j).

Solution of the restricted master problem (ctd)

Equivalent formulation of the RMP: find the vector of weights w ∈ R^r to maximize
Ψ(w) = log det(H(w))
subject to the constraints 1^T w = 1, w ≥ 0, where H(w) = Σ_{j=1}^r w_j H_j and H_j = M(q_j).

Proposition 2. Suppose that the matrix H(w*) is nonsingular for some w* ∈ S_r. The vector w* constitutes a global solution to the RMP if and only if, for each j = 1, ..., r,
ψ_j(w*) = m if w*_j > 0,
ψ_j(w*) ≤ m if w*_j = 0,
where ψ_j(w) = tr[ H(w)^{-1} H_j ], j = 1, ..., r, and m denotes the number of estimated parameters (the dimension of the FIM).

Multiplicative algorithm for the RMP

Step 0 (Initialization): Select a weight vector w^(0) ∈ S_r ∩ R^r_{++}, e.g., set w^(0) = (1/r) 1. Set l = 0.

Step 1 (Termination check): If ‖ (1/m) ψ(w^(l)) − 1 ‖ ≤ ε for a prescribed tolerance ε, then STOP.

Step 2 (Multiplicative update): Evaluate
w^(l+1) = (1/m) ψ(w^(l)) ∘ w^(l)
(the product being componentwise). Increment l by one and go to Step 1. (A sketch in code is given below.)

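A sketch of the multiplicative algorithm for the RMP (illustrative names, max-norm stopping rule assumed); it also assembles the matrices H_j = M(q_j) from the current point set Q, so it can serve as the solve_rmp used in the outer loop above.

```python
import numpy as np

def solve_rmp(Q, M_list, tol=1e-6, max_iter=1000):
    """Restricted master problem: maximize log det( sum_j w_j M(q_j) ) over the simplex."""
    H = np.stack([np.einsum('i,ijk->jk', q, M_list) for q in Q])  # H_j = M(q_j)
    r, m = len(Q), H.shape[1]                    # m = number of parameters
    w = np.full(r, 1.0 / r)                      # start at the centre of the simplex
    for _ in range(max_iter):
        Hinv = np.linalg.inv(np.einsum('j,jkl->kl', w, H))
        psi = np.einsum('kl,jlk->j', Hinv, H)    # psi_j = tr{ H(w)^{-1} H_j }
        if np.abs(psi / m - 1.0).max() <= tol:   # termination check (Proposition 2)
            break
        w = w * psi / m                          # multiplicative update
    return w
```

Note that the update preserves the simplex constraint automatically, since Σ_j w_j ψ_j(w) = tr{ H(w)^{-1} H(w) } = m.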

Numerical example

Consider a batch reactor initially loaded with an aqueous solution of component A. In the presence of a solid catalyst, this reacts to form components B and C according to the consecutive reaction scheme A → B → C. The time changes of the concentrations [A], [B] and [C] are governed by
d[A]/dt = −k_1 [A]^{γ_1},                 [A]|_{t=0} = 1,
d[B]/dt = k_1 [A]^{γ_1} − k_2 [B]^{γ_2},  [B]|_{t=0} = 0,
d[C]/dt = k_2 [B]^{γ_2},                  [C]|_{t=0} = 0,
where k_1 and k_2 are the rates and γ_1 and γ_2 the orders of the reactions. Usually, the coefficients k_1, k_2, γ_1 and γ_2 are not known in advance.

Numerical example (ctd)

We set x_i = t_i, i = 1, ..., n, θ = (k_1, k_2, γ_1, γ_2) and η(t, θ) = ([A](t; θ), [B](t; θ), [C](t; θ)).

Moreover,
θ^0 = (0.7, 0.2, 1.1, 1.5),
V(t_i) = I_3, i = 1, ..., n,
F(t_i)^T = ∂η/∂θ (t_i, θ^0), i = 1, ..., n.

Consider n = 100 potential support points evenly distributed over the time interval [0, 20].
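One possible way (a sketch, not the authors' code) to assemble the elementary information matrices for this example: integrate the kinetics with SciPy and approximate the sensitivities F(t_i)^T = ∂η/∂θ(t_i, θ^0) by finite differences; with V(t_i) = I_3 this gives M_i = F(t_i) F(t_i)^T. The grid below is one plausible reading of "100 points evenly distributed on [0, 20]".

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(t, y, k1, k2, g1, g2):
    """Consecutive reactions A -> B -> C with rates k1, k2 and orders g1, g2."""
    A, B, _ = y
    r1 = k1 * max(A, 0.0) ** g1
    r2 = k2 * max(B, 0.0) ** g2
    return [-r1, r1 - r2, r2]

def responses(theta, t_grid):
    """Concentrations [A], [B], [C] at the candidate sampling times t_grid."""
    sol = solve_ivp(rhs, (0.0, t_grid[-1]), [1.0, 0.0, 0.0],
                    t_eval=t_grid, args=tuple(theta), rtol=1e-8, atol=1e-10)
    return sol.y.T                                    # shape (n, 3)

def information_matrices(theta, t_grid, h=1e-5):
    """M_i = J_i^T J_i, with J_i = d eta / d theta at t_i (finite-difference Jacobians)."""
    base = responses(theta, t_grid)
    n, m = len(t_grid), len(theta)
    J = np.zeros((n, 3, m))
    for k in range(m):
        pert = np.array(theta, dtype=float)
        pert[k] += h
        J[:, :, k] = (responses(pert, t_grid) - base) / h
    return np.einsum('isk,isl->ikl', J, J)            # stack of m x m matrices M_i

t_grid = np.linspace(0.2, 20.0, 100)                  # candidate sampling times
M_list = information_matrices([0.7, 0.2, 1.1, 1.5], t_grid)
```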

[Figures: responses [A](t), [B](t), [C](t) over t ∈ [0, 20] together with the resulting designs for the bounds p ≤ 0.35·1, p ≤ 0.15·1 and p ≤ 0.05·1.]

[Figures: variance function and design weights over t ∈ [0, 20] for the bounds p ≤ 0.35·1, p ≤ 0.15·1 and p ≤ 0.05·1.]

[Figures: convergence of det(M(p^(k))) versus the iteration number k for the bounds p ≤ 0.35·1, p ≤ 0.15·1 and p ≤ 0.05·1.]

Conclusions

A simple algorithm was developed for constructing constrained D-optimum designs on finite design spaces. Extensive numerical experiments demonstrate that it can outperform approaches based on sophisticated general-purpose nonlinear programming solvers. Its unquestionable advantage is the simplicity of implementation, which requires neither additional numerical routines nor painstaking programming effort.

A refinement: restricted simplicial decomposition, based on the observation that a particular feasible solution, such as the optimal one, can often be represented as a convex combination of far fewer extreme points than implied by Carathéodory's theorem (Hearn et al., 1985; 1997; Ventura and Hearn, 1993).

Conclusions (ctd)

Apart from that, some improvements aimed at removing nonoptimal support points, proposed by Luc Pronzato, can be incorporated into the restricted master problem to speed up its solution.

The method can also be employed to find upper bounds on the maximum value of the objective function in the design of a monitoring network for parameter estimation of systems described by partial differential equations. Using this technique in conjunction with the branch-and-bound method, it was possible to select hundreds of gauged sites from among thousands of admissible sites within no more than five minutes on a low-cost PC (Uciński and Patan, 2007).

Conclusions (ctd)

Although the interest here was in constructing D-optimum designs under bound constraints, the same simplicial decomposition technique can be applied to other smooth optimality criteria (e.g., A-optimality), and other linear constraints on the design weights can easily be included.

Efficient parallelization is possible via parallel variable distribution (Ferris and Mangasarian, 1994; Solodov, 1998).

Extension to continuous designs is possible as well (Ermoliev et al., 1985; Higgins and Polak, 1990; Cook and Fedorov, 1995; Shapiro and Ahmed, 2004).
