An algorithm based on Semidefinite Programming for finding minimax optimal designs


Zuse Institute Berlin, Takustr. 7, 14195 Berlin, Germany

BELMIRO P.M. DUARTE, GUILLAUME SAGNOL, WENG KEE WONG

An algorithm based on Semidefinite Programming for finding minimax optimal designs

ZIB Report (December 2017)


An algorithm based on Semidefinite Programming for finding minimax optimal designs

Belmiro P.M. Duarte a,b, Guillaume Sagnol c, Weng Kee Wong d

a Polytechnic Institute of Coimbra, ISEC, Department of Chemical and Biological Engineering, Portugal. b CIEPQPF, Department of Chemical Engineering, University of Coimbra, Portugal. c Technische Universität Berlin, Institut für Mathematik, Germany. d Department of Biostatistics, Fielding School of Public Health, UCLA, U.S.A.

Abstract

An algorithm based on a delayed constraint generation method for solving semi-infinite programs for constructing minimax optimal designs for nonlinear models is proposed. The outer optimization level of the minimax optimization problem is solved using a semidefinite programming based approach that requires the design space be discretized. A nonlinear programming solver is then used to solve the inner program to determine the combination of the parameters that yields the worst-case value of the design criterion. The proposed algorithm is applied to find minimax optimal designs for the logistic model, the flexible 4-parameter Hill homoscedastic model and the general nth order consecutive reaction model, and shows that it (i) produces designs that compare well with minimax D optimal designs obtained from the semi-infinite programming method in the literature; (ii) can be applied to semidefinite representable optimality criteria, which include the common A, E, G, I and D optimality criteria; (iii) can tackle design problems with arbitrary linear constraints on the weights; and (iv) is fast and relatively easy to use.

Key words: Cutting plane algorithm, Design efficiency, Equivalence theorem, Model-based optimal design, Nonlinear programming.

Correspondence to: Polytechnic Institute of Coimbra, ISEC, Department of Chemical and Biological Engineering, R. Pedro Nunes, Coimbra, Portugal. E-mail addresses: bduarte@isec.pt (Belmiro P.M. Duarte), sagnol@math.tu-berlin.de (Guillaume Sagnol), wkwong@ucla.edu (Weng Kee Wong). An example of the code is presented as Supplementary Material.

Preprint submitted to Computational Statistics and Data Analysis, January 4, 2018

1. Motivation

We consider the problem of determining model-based optimal designs of experiments (M-bODE) for statistical models. Such a problem has increasing relevance today to control costs, with broad applications in areas such as biomedicine, engineering, and the pharmaceutical industry, to name a few (Berger and Wong, 2009; Goos and Jones, 2011; Fedorov and Leonov, 2014). M-bODE are useful because they can provide maximum information with high statistical efficiency at minimum cost.

Our setup is that we have a given parametric model defined on a compact design space, a given design criterion and a given budget, which typically translates into having a predetermined total number of observations, n, available for the study. The design problem is to determine optimal design points that describe the experimental conditions, and whether replicates are required at each of these design points, subject to the requirement that they sum to n. These design issues involve hard combinatorial optimization problems that are known to be NP-hard (Welch, 1982). However, the limit problem where n → ∞, which is the focus of this paper, and in which we search for the optimal proportion of the total number of trials to be performed at each individual design point, is an easier, convex continuous optimization problem.

In the field of M-bODE, mathematical programming approaches have been successfully applied to solve design problems. Some examples are Linear Programming (Gaivoronski, 1986; Harman and Jurík, 2008), Second Order Conic Programming (Sagnol, 2011; Sagnol and Harman, 2015), Semidefinite Programming (SDP) (Vandenberghe and Boyd, 1999; Papp, 2012; Duarte and Wong, 2015), Semi-Infinite Programming (SIP) (Duarte and Wong, 2014; Duarte et al., 2015), Nonlinear Programming (NLP) (Chaloner and Larntz, 1989; Molchanov and Zuyev, 2002), NLP combined with stochastic procedures such as genetic algorithms (Heredia-Langner et al., 2004; Zhang, 2006), and global optimization techniques (Boer and Hendrix, 2000; Duarte et al., 2016). Traditional algorithms for finding optimal designs are reviewed, compared and discussed in Cook and Nachtsheim (1982) and Pronzato (2008), among others. Mandal et al. (2015) provide a review of algorithms for generating optimal designs, including nature-inspired meta-heuristic algorithms, which are increasingly used in computer science and engineering to solve high dimensional complex optimization problems.

When the design criterion is not differentiable, finding an optimal design for a general nonlinear model is a difficult computational task. For instance, consider finding a minimax (or maximin) optimal design, which has a non-differentiable criterion. The design criterion remains convex and there is an equivalence theorem for checking the optimality of the design.

However, in practice there are considerable difficulties in finding and checking whether a design is optimal under the minimax framework (Noubiap and Seidel, 2000). In particular, the problem has two or more levels of optimization, and the non-differentiability of the criterion requires mathematical programming based algorithms that can compute sub-gradients and iteratively evaluate the global or local (lower/upper) bounds to solve the optimization problem.

Minimax design problems abound in practice. For example, one may wish to estimate the overall response surface in a dose response study. A common design criterion is to consider the areas where the largest predictive variance may occur and find a design that minimizes the maximum predictive variance. The outer level program finds a design after the inner problem determines the set of model parameters that results in the largest possible predictive variance.

Theoretically, we can use SIP to tackle optimal design problems for any nonlinear model, see for example Duarte and Wong (2014). However, there are three potential issues with the SIP approach: (i) the task of programming from scratch the various functionals of the inverse of the Fisher Information Matrix (FIM) (e.g., determinant, trace) for the various design criteria can be complex and so may limit generalization of the algorithm to solve other problems; (ii) the SIP-based approach finds the optimal design in the outer optimization problem (which is a hard problem) using a NLP solver that does not guarantee global optimality of the solution unless global optimization techniques are employed; and (iii) the SIP-based algorithm determines the number of support points for the optimal design iteratively, which can be time consuming. This is in contrast to our proposed combined SDP-NLP approach, where (i) we solve the outer optimization problem using SDP, which guarantees the global optimality of the design; (ii) we use the NLP solver only to solve the inner optimization program; (iii) we find the number of support points for the optimal design simultaneously using SDP; and (iv) our method optimizes the design points from a pre-determined candidate set of points versus having to search for the number and the location of the design points over a continuous domain.

Our main contribution is an algorithm for determining minimax optimal designs using a combination of mathematical programming algorithms that in turn rest on deterministic, replicable methods. The algorithm creatively combines SDP and NLP to find minimax A, E and D optimal designs easily and realize the efficiency gain in computing. In particular, the convex solver is able to identify support points of the design (within the predetermined set of points) by assigning zero weight to non-support points for a broad class of semidefinite representable criteria. Our approach is flexible in that it can incorporate other constraints

and strategies when appropriate. For instance, adaptive grid techniques recently proposed by Duarte et al. (2017) may be incorporated into our algorithm to refine the support points and collapse those that are close in proximity using a pre-defined ε_COL-vicinity tolerance. The key advantages of our approach are that the SDP solver is guaranteed to find the global optimum in polynomial time, and that the NLP solver only finds the parameter combination at which a locally optimal design is least efficient, which is, in most problems, a low-dimensional program. Another innovative aspect is using the structure of the plausible region of the parameters to judiciously construct the initial approximation of the continuous domain, which increases the convergence rate to the solution.

Section 2 presents mathematical background for the SDP and NLP formulation problems. Section 3 provides the specifics of the SDP and NLP formulations for the outer and inner levels of the optimization problem. Section 4 presents three applications of the algorithm to find minimax optimal designs for nonlinear models where the nominal values of the parameters of interest all belong to a user-specified set called the plausibility region. We first consider the logistic model and then the more flexible and widely used 4-parameter homoscedastic Hill model, to test whether our proposed algorithm can generate minimax optimal designs similar to those reported in the literature. In the third application, we test our algorithm using the nth order consecutive reaction model described by ordinary differential equations. Section 5 concludes with a summary.

2. Background

This section provides the background material and the mathematical formulation for finding optimal experimental designs for nonlinear models. In Section 2.1 we introduce the minimax optimal design problem. Section 2.2 introduces the fundamentals of SDP and Section 2.3 presents the basics of NLP.

We use bold face lowercase letters to represent vectors, bold face capital letters for continuous domains, blackboard bold capital letters for discrete domains, and capital letters for matrices. Finite sets containing ι elements are compactly represented by [ι] = {1, ..., ι}. Throughout we assume we have a regression model with a univariate response and several regressors x ∈ X ⊂ R^{n_x}, where n_x is the dimension of the design space. The mean response at x is

E[y | x, p] = f(x, p),   (1)

where f(x, p) is a given differentiable function, E[·] is the expectation operator with respect to the error distribution, and the vector of unknown model parameters is p ∈ P ⊂ R^{n_p}. Here, P is a user-selected n_p-dimensional Cartesian box P ≡ ∏_{j=1}^{n_p} [l_j, u_j], with each interval [l_j, u_j] representing the plausible range of values for the jth parameter. The exact optimal design problem is: given a design criterion and a predetermined sample size, n, select a set of n runs using the best combinations of the levels of the regressors to observe the responses. Here, we focus on approximate or continuous designs, which are large sample designs, so that the replicates can be viewed as proportions of the total number of observations to be taken at the design points.

A continuous design, ξ, is characterized by the number of support or design points, k, their locations x_i ∈ R^{n_x}, i ∈ [k], from a user-specified design space X, and the proportion of the total number of observations, w_i, to assign to each design point. Clearly, w_i ∈ (0, 1) and w_1 + w_2 + ... + w_k = 1. In practice, continuous designs are implemented by taking roughly n w_i replicates at level x_i, i ∈ [k], after rounding n w_i to an integer subject to the constraint n w_1 + ... + n w_k = n. Advantages of working with continuous designs are many, and there is a unified framework for finding optimal continuous designs for M-bODE problems when the design criterion is a convex function on the set of all approximate designs (Fedorov, 1980). In particular, the optimal design problem can be formulated as a mathematical optimization program with convex properties, and equivalence theorems are available to check the optimality of a design in a practical way.

Consider a continuous design ξ with k points, where the weight of the ith design point x_i^T = (x_{i,1}, ..., x_{i,n_x}) is w_i, with Σ_{i=1}^k w_i = 1. We identify such a design with the discrete probability measure ξ = Σ_{i=1}^k w_i δ_{x_i}, and simply represent it by a list of k vectors (x_i^T, w_i), i ∈ [k]. Under mild regularity assumptions on the probability density function of the observations and the design ξ, the variance-covariance matrix of the Maximum Likelihood Estimates is well approximated by the inverse of the Fisher Information Matrix (FIM) and attains the lower bound in the Cramér-Rao inequality (Rao, 1973). Consequently, one can view an optimal design problem as an allocation scheme of the covariates to minimize, in some sense, the variance-covariance matrix. Given the relation between the variance-covariance matrix and the FIM, the worth of the design ξ is measured by its FIM, which is the matrix with elements equal to the negative of the expectation of the second order derivatives of the log-likelihood of all observed data with respect to the parameters.

When responses are independent, the normalized FIM of a continuous design ξ is

M(ξ, p) = E[−∂²L(ξ, p)/(∂p ∂p^T)] = ∫_X M(δ_x, p) ξ(dx) = Σ_{i=1}^k w_i M(δ_{x_i}, p).   (2)

Here, L(ξ, p) is the log-likelihood function of the observed responses using design ξ, δ_x is the degenerate design that puts all its mass at x, and M(ξ, p) is the global FIM from the design ξ for making inference on the parameter vector p. Sometimes the global FIM is also referred to as the total FIM, or simply the FIM, and the FIM from a degenerate design is called elemental. In what is to follow, we show our proposed semidefinite programming based approach is well suited to solve minimax optimal design problems. The methodology requires the design space X be discretized into many points. Let 𝕏 be the discretized version of X with, say, q points. A common and simple way to discretize X is to use a grid set with equally-spaced points ∆x units apart on each of the design spaces for all the regressor variables. We then search for a probability measure χ on 𝕏 so that Σ_{x ∈ 𝕏} M(δ_x, p) χ(x) approximates (2) as closely as possible.

When errors are normally and independently distributed, the volume of the asymptotic confidence region of p is proportional to det[M^{−1/2}(ξ, p)], and so maximization of the determinant of the FIM leads to the smallest possible volume. Other design criteria maximize the FIM in different ways and are usually formulated as a convex function of the FIM. For example, when p is fixed, the locally D, A and E optimal designs are each, respectively, defined by

ξ_D = arg min_{ξ ∈ Ξ} {det[M(ξ, p)^{−1}]}^{1/n_p},   (3)
ξ_A = arg min_{ξ ∈ Ξ} tr[M(ξ, p)^{−1}],   (4)
ξ_E = arg min_{ξ ∈ Ξ} 1/λ_min[M(ξ, p)].   (5)

Here λ_min (λ_max) is the minimum (maximum) eigenvalue of the FIM, and Ξ is the set of all designs on X. We note that 1/λ_min[M(ξ, p)] = λ_max[M(ξ, p)^{−1}], and by continuity, the design criteria (3)-(5) are defined as +∞ for designs with singular information matrices.
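As an illustration, the quantities in (2)-(5) are cheap to evaluate once the elemental FIMs are available. The following minimal numpy sketch (an illustration only, not the implementation used in this paper) assembles the FIM of a discrete design from rank-one elemental matrices and evaluates the three criteria; grad_mean is a hypothetical user-supplied function returning ∂f(x, p)/∂p for the mean function in (1), which yields the elemental FIM under homoscedastic Gaussian errors.

import numpy as np

def fim(points, weights, p, grad_mean):
    """Normalized FIM of (2): M(xi, p) = sum_i w_i M(delta_{x_i}, p)."""
    M = np.zeros((len(p), len(p)))
    for x, w in zip(points, weights):
        h = grad_mean(x, p)          # gradient of the mean response at x
        M += w * np.outer(h, h)      # rank-one elemental FIM (up to a variance factor)
    return M

def d_criterion(M):
    return np.linalg.det(np.linalg.inv(M)) ** (1.0 / M.shape[0])   # (3)

def a_criterion(M):
    return np.trace(np.linalg.inv(M))                              # (4)

def e_criterion(M):
    return 1.0 / np.linalg.eigvalsh(M)[0]                          # (5), 1/lambda_min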

More generally, Kiefer (1974) proposed a class of positively homogeneous design criteria defined on the set of symmetric n_p × n_p positive semidefinite matrices S^+_{n_p}. If M(ξ, p) ∈ S^+_{n_p}, this class is

Φ_δ[M(ξ, p)^{−1}] = [ (1/n_p) tr(M(ξ, p)^{−δ}) ]^{1/δ},   (6)

where δ ≥ −1 is a parameter. We note that Φ_δ is proportional to (i) tr[M(ξ, p)^{−1}], which is A optimality, when δ = 1; (ii) 1/λ_min[M(ξ, p)], which is E optimality, when δ = +∞; and (iii) {det[M(ξ, p)^{−1}]}^{1/n_p}, which is D optimality, when δ → 0. Because these criteria are convex on the space of information matrices, the global optimality of any design ξ in X can be verified using equivalence theorems, see for example Whittle (1973), Kiefer (1974) and Fedorov (1980). They are derived from directional derivative considerations and have a general form, with each convex criterion having its own specific form. For instance, the equivalence theorems for D and A optimality are as follows: (i) ξ_D is locally D-optimal if and only if

tr{[M(ξ_D, p)]^{−1} M(δ_x, p)} − n_p ≤ 0, ∀x ∈ X;   (7)

and (ii) ξ_A is locally A-optimal if and only if

tr{[M(ξ_A, p)]^{−2} M(δ_x, p)} − tr{[M(ξ_A, p)]^{−1}} ≤ 0, ∀x ∈ X.   (8)

We call the functions on the left side of the inequalities in (7) and (8) dispersion functions.
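These conditions are easy to verify numerically on a grid. A small sketch (our illustration, assuming the elemental FIMs M(δ_x, p) are available as a list of matrices, one per candidate point) is:

import numpy as np

def d_dispersion(M_design, elemental_fims, n_p):
    """Left side of (7), evaluated at every candidate point."""
    Minv = np.linalg.inv(M_design)
    return np.array([np.trace(Minv @ Mx) - n_p for Mx in elemental_fims])

def a_dispersion(M_design, elemental_fims):
    """Left side of (8), evaluated at every candidate point."""
    Minv = np.linalg.inv(M_design)
    return np.array([np.trace(Minv @ Minv @ Mx) - np.trace(Minv)
                     for Mx in elemental_fims])

# A design is (locally) optimal when the dispersion values are <= 0 over the
# whole grid, with equality (up to tolerance) at the support points.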

2.1. Minimax designs

Nonlinear models are common in many areas, with typical applications ranging from engineering to pharmacokinetics. For such models, the FIM depends on the parameters, and consequently all design criteria, which are dependent on the FIM, depend on the unknown parameters that we want to estimate. When nominal values are assumed for these parameters, the resulting designs are termed locally optimal. The design strategies commonly used to handle this dependence include the use of: (i) a sequence of locally optimal designs, each computed using the latest estimate of p; (ii) Bayesian designs that optimize the expectation of the optimality criterion value averaged over the prior distribution of the model parameters p in P (Chaloner and Larntz, 1989); and (iii) minimax designs that minimize the maximal value of the criterion over the unknown values of the model parameters in P (Wong, 1992). In either case the prior distribution and the set P are assumed known. Here we focus on finding minimax optimal designs.

Minimax design criteria have many forms, and in this paper we assume that there is a priori minimal knowledge of the true values of the parameters. One scenario is that the user has a plausible range of values for each model parameter. A reasonable goal is to find a design that maximizes the minimal accuracy of the parameter estimates, regardless of which set of model parameter values in the plausible region is the truth. The maximin, or equivalently, the minimax Φ_δ-optimal design sought is the one that minimizes

max_{p ∈ P} Φ_δ[M(ξ, p)^{−1}]   (9)

among all designs in Ξ. The minimax optimal design problems for the A, E and D optimality criteria are constructed from (9) with δ = 1, δ = +∞ and δ → 0, respectively, see (3)-(5). The mathematical program for problem (9) is

min_{ξ ∈ Ξ} max_{p ∈ P} Φ_δ[M(ξ, p)^{−1}]   (10a)
s.t. Σ_{i=1}^k w_i = 1.   (10b)

As an example, application of (10) to find a minimax E optimal design yields:

min_{ξ ∈ Ξ} max_{p ∈ P} (λ_min[M(ξ, p)])^{−1}   (11a)
s.t. Σ_{i=1}^k w_i = 1,   (11b)

and similar formulations apply for the other criteria in the Φ_δ class.

Our minimax design problems remain convex, and so equivalence theorems for verifying the optimality of a minimax design can also be constructed. The reader is referred to Berger and Wong (2009) for details on the general equivalence theorems for a minimax type of optimality criterion. This is discussed more generally in Wong (1992), who gave a unified approach to constructing minimax optimal designs. Duarte and Wong (2014), among others, displayed the equivalence theorems for minimax D, A and E optimality, and for space considerations we do not discuss them in this paper. They are more complicated than the ones for locally optimal designs because sub-gradients are involved in the dispersion functions.

These plots represent the dispersion functions and have a general pattern; if the design under investigation is optimal, the dispersion function must be bounded from above by zero, with equality at every support point of the design. Figure 2 in Section 4 displays examples of such plots. We next review programming based tools for solving such minimax problems.

Finding the design ξ in (10a) is called the outer problem of the minimax program. The inner problem determines, for a given design, the values of the model parameters p that make the criterion least attainable. The overall optimization problem is complex because it poses several challenges: (i) the objective function of the outer problem is not differentiable, and consequently the sub-gradient used in the approximation may lead to many maximizers; (ii) the inner problem can be sensitive and may require a long computing time to solve accurately; and (iii) global maxima of the inner problem are required to solve the minimax problem (Rustem et al., 2008).

2.2. Semidefinite programming

We use semidefinite programming to solve the outer problem by first discretizing the design space before applying a SDP solver to find the optimal design. Details on the general use and application of SDP to search for optimal designs for linear models are available in Vandenberghe and Boyd (1996). Additional applications include finding (i) criterion-robust maximin designs for linear models (Filová et al., 2011); (ii) D optimal designs for polynomial models and rational functions (Papp, 2012); and (iii) Bayesian optimal designs for nonlinear models (Duarte and Wong, 2015). This section reviews briefly the basics of this class of mathematical programs.

Let S^+_{n_p} be the space of n_p × n_p positive semidefinite matrices. A function ϕ : R^{m_1} → R is called semidefinite representable (SDr) if and only if inequalities of the form u ≥ ϕ(ζ) can be expressed by linear matrix inequalities (LMI), see for example Papp (2012) and Sagnol (2013). That is, ϕ(ζ) is SDr if and only if there exist symmetric matrices M_0, ..., M_{m_1}, ..., M_{m_1+m_2} such that

u ≥ ϕ(ζ) ⟺ ∃v ∈ R^{m_2} : u M_0 + Σ_{i=1}^{m_1} ζ_i M_i + Σ_{j=1}^{m_2} v_j M_{m_1+j} ⪰ 0.   (12)

Here, ⪰ stands for the ordering associated with the semidefinite cone, where A ⪰ B holds if and only if A − B ∈ S^+_{n_p}. For a given vector c, the optimal values, ζ ∈ R^{m_1}, of the SDr functions are found from semidefinite programs of the form:

max_ζ c^T ζ,  s.t. Σ_{i=1}^{m_1} ζ_i M_i − M_0 ⪰ 0.   (13)

In the design context, c is a vector of known constants that depends on the design problem, and the matrices M_i, i ∈ [m_1], contain the local FIMs and other matrices produced by the reformulation of the functions ϕ. The decision variables in the vector ζ contain the weights w_i, i ∈ [q], of the optimal design and the optimized values of the auxiliary variables introduced, and q is pre-determined by the discretization scheme. The outer problem corresponds to finding a design for a finite set of parameter combinations p, found by solving (13) subject to the linear constraints on the weights such that they are non-negative and sum to one. Ben-Tal and Nemirovski (2001, Chap. 2-3) provide a list of SDr functions useful in SDP formulations for solving M-bODE problems; see Boyd and Vandenberghe (2004, Sec. 7.3) for the basis. Sagnol (2013) showed that each criterion in Kiefer's class of optimality criteria in (6) is SDr for all rational values of δ ∈ [−1, +∞), where δ = 0 is taken to mean the limiting case δ → 0, i.e., the D optimality criterion. We note that when there is a finite number α of SDr functions ϕ_1, ..., ϕ_α, then min_{i=1,...,α} {ϕ_i} is also SDr, implying that one can formulate the worst-case scenario problem via a semidefinite representable hypograph.
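As a concrete illustration of such a program, the following Python sketch computes a locally D optimal design on a grid with cvxpy (used here in place of the cvx and Picos interfaces discussed in Section 3) by maximizing log det M(ξ, p), a monotone transform of the D criterion in (3); elemental_fims is assumed to be the list of the q matrices M(δ_x, p) for a fixed p.

import cvxpy as cp

def locally_d_optimal(elemental_fims):
    q = len(elemental_fims)
    w = cp.Variable(q, nonneg=True)                  # design weights
    M = sum(w[i] * elemental_fims[i] for i in range(q))
    problem = cp.Problem(cp.Maximize(cp.log_det(M)), [cp.sum(w) == 1])
    problem.solve()                                  # any SDP/exp-cone capable solver
    return w.value                                   # zero weight at non-support points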

2.3. Nonlinear programming

Nonlinear programming seeks the optimum of a mathematical program in which some of the constraints or the objective function are nonlinear. A general nonlinear program has the following form:

max_{p ∈ P} f(p)   (14a)
s.t. g(p) ≤ 0,   (14b)
h(p) = 0,   (14c)

where f(p) is a linear/nonlinear function, g(p) is a set of inequality constraints which may be linear and/or nonlinear, and h(p) is a set of equality constraints. Here, p are the variables to be optimized over a known compact set P. We use nonlinear programming to solve the inner problem of (9) for each fixed design ξ. Our objective function is the criterion Φ_δ[M(ξ, p)^{−1}], and the solution to the inner problem is the vector of parameters that results in ξ being least efficient.
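For instance, one local solve of this inner program can be sketched with a gradient-based box-constrained solver (our illustration; criterion is a hypothetical function evaluating Φ_δ[M(ξ, p)^{−1}] for the current design ξ):

from scipy.optimize import minimize

def worst_case_parameters(criterion, lower, upper, p0):
    """Maximize the criterion over the box P by minimizing its negative."""
    res = minimize(lambda p: -criterion(p), p0,
                   bounds=list(zip(lower, upper)), method="L-BFGS-B")
    return res.x, -res.fun      # candidate worst-case p and its criterion value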

Finding the solution of the nonlinear program (14) is the most difficult step in the algorithm (Pázman and Pronzato, 2014) because it requires finding globally optimal solutions in the compact domain P to guarantee the convergence of the upper bound, as we will show in Section 3. There are several algorithms commonly used to solve NLP problems, including the following: Generalized Reduced Gradient (GRG) (Drud, 1985, 1994), Sequential Quadratic Programming (SQP) (Gill et al., 2005), Interior-Point (IP) (Byrd et al., 1999), and Trust-Region (Coleman and Li, 1994). Ruszczyński (2006) provides an overview of NLP algorithms. The NLP solvers employed in our proposed algorithm do not guarantee global optimality, only local optimality. This is in contrast to those based on Lipschitz global optimization techniques, which guarantee convergence but are a lot less computationally efficient. There are heuristics, such as resorting to multiple restarts or simulated annealing, that can be used to find a good local optimum, but in general it would require the use of branch and bound techniques to prove global optimality (Sahinidis, 2014). Most Lipschitz global optimization solvers are based on branch and bound techniques or stochastic procedures, and may require long CPU times. Consequently, we avoid them for practical reasons and use a gradient-based local NLP solver. These tools might be unable to find global optima, but they guarantee that local optima can be found in mild computational time, and since the problems are of low dimension we believe that in most cases these coincide with the global optima. To increase the efficiency and accuracy of the NLP solver we use an automatic differentiation tool to generate the Jacobian analytically. For the same purpose, we developed a global nonlinear programming tool based on a multistart heuristic algorithm that was compared with the NLP solver.

3. Algorithms

This section describes our proposed algorithm for finding minimax Φ_δ-optimal designs over Ξ, or more generally over Ξ_{A,b}, where Ξ_{A,b} := {(x_i, w_i)_{i=1,...,k} : w ≥ 0, Σ_i w_i = 1, A w ≤ b}, A is a matrix and b is a vector. In Section 3.1 we formulate and solve the outer optimization problem using SDP, and in Section 3.2 we tackle the inner optimization problem using a NLP solver.

The minimax optimal design problem is handled in program (10) after the entire covariate design space X is discretized. The optimization problem has finitely many variables to be optimized over a given feasible set described by infinitely many constraints, one for each p ∈ P (López and Still, 2007). This falls into the class of semi-infinite programming, where Hettich and Kortanek (1993) and

López and Still (2007) each provide a survey of the theory, applications and recent developments. The algorithms employed to solve SIP problems fall into three classes: (i) exchange methods; (ii) discretization based methods; and (iii) local reduction based methods (Hettich et al., 2001). Here, we use an exchange based procedure similar to the one proposed by Blankenship and Falk (1976), and further exploited by Žakovíc and Rustem (2003) and Mitsos and Tsoukalas (2015), among others. To convert (10) into an equivalent semi-infinite program we note that, for a given value of τ_LB ∈ R:

max_{p ∈ P} Φ_δ[M(ξ, p)^{−1}] ≤ τ_LB ⟺ Φ_δ[M(ξ, p)^{−1}] ≤ τ_LB, ∀p ∈ P.   (15)

Accordingly, the equivalent semi-infinite program obtained using a relaxation procedure is (Shimizu and Aiyoshi, 1980):

min_{ξ ∈ Ξ, τ_LB ∈ T} τ_LB   (16a)
s.t. Φ_δ[M(ξ, p)^{−1}] ≤ τ_LB, ∀p ∈ P,   (16b)
Σ_{i=1}^k w_i = 1,   (16c)

where T = [τ^L, τ^U] ⊂ R is a closed interval, with τ^L being a large negative value and τ^U a large positive one. The semi-infinite program (16) is called the master problem when P is infinite. One advantage of our approach is that additional linear constraints on the design weights, if any, can be handled in a straightforward fashion. Indeed, if a minimax optimal design is sought over Ξ_{A,b}, one must simply add the following constraints to the master problem:

A w ≤ b.   (16d)

When P is replaced by a finite subset ℙ of P, we obtain an approximation of program (16) which is commonly designated a restricted master problem (RMP). By construction, the optimal solution of this RMP provides a lower bound on the optimal value of (16), see for example Mutapcic and Boyd (2009), and our strategy is to solve the RMP using semidefinite programming. A delayed constraint generation method is used to solve the original problem, which has an infinite number of constraints. We do so by approximating the problem by a finite set of points ℙ sampled from P. The constraints might be saturated at the optimum

(Terry, 2009), and the term delayed means that not all constraints are present at the beginning. Initially, the set ℙ is populated with random points, and at each iteration that follows, additional vectors p that make the SDP-generated design least efficient are included. The optimization problem (16) is solved iteratively, alternating between two phases as follows.

The proposed algorithm has similarities with the approach used in Pronzato and Walter (1988) and Walter and Pronzato (1997) to find minimax designs. Here, we start the convergence with a finite number of vectors in ℙ, not only one as in the latter reference. Similarly to Walter and Pronzato (1997), the outer and the inner optimization problems are solved iteratively until convergence. The inner program is formulated as a NLP and serves to find the vector of parameters at which a locally optimal design is least efficient; the outer program, formulated as a SDP on a previously discretized design space, determines the optimal design for a finite set ℙ that replaces P. The vectors of parameters that solve the inner problem for a given locally optimal design are appended to the set of instances ℙ, increasingly constraining the design. Notice that the set ℙ replacing P changes at each iteration. Suppose that the initial set ℙ^(0) has m_0 elements; at the jth iteration this set becomes ℙ^(j) and has m_0 + j elements. This assumes that we add one element of P to the original finite set ℙ^(0) at each iteration, and this element is the worst p for a given locally optimal design ξ.

The algorithm iterates between the outer program P1 and the inner problem P2 until convergence. If ξ^(0) is the initial design and ξ^(j−1) is the optimal design at the (j−1)th iteration, we solve the Phase 1 outer problem P1 (16) by finding a design ξ^(j) and a lower bound of the minimax program, τ^(j)_LB, after P is replaced by ℙ^(j−1); and the Phase 2 inner problem P2 by finding the set of model-parameter values p^(j) that yields the worst (largest) possible value of the criterion Φ_δ[M(ξ, p)^{−1}] for ξ^(j). The solution provides an upper bound τ^(j)_UB for the minimax problem, and the algorithm iterates after adding p^(j) to ℙ^(j−1) to obtain, for the next iteration,

ℙ^(j) = ℙ^(j−1) ∪ {p^(j)}.   (18)

The above algorithm stops when the two bounds converge and an ε-optimal solution is reached, where ε is a user-specified small positive constant. In practice

this means that after each iteration one verifies whether the condition

(τ^(j)_UB − τ^(j)_LB) / τ^(j)_UB ≤ ε   (19)

is satisfied. Blankenship and Falk (1976) showed that the procedure is guaranteed to converge to the global optimum in a finite number of iterations. Moreover, we shall see in Proposition 1 that the stopping criterion (19) ensures a lower bound on the efficiency of the returned design.

Our algorithm has similarities with algorithms developed by Melas (2006) using the functional approach for determining maximin and standardized maximin optimal designs. The key idea there is that the optimal support points and the associated optimal weights can be treated as functions of the estimated parameters and, as such, can be developed into a Taylor series around the nominal values of these parameters. The author uses standard arguments to prove that a sequence of locally optimal designs obtained for a given discrete set of parameter combinations contains a weakly convergent subsequence with a limit, and the standardized maximin optimality can be checked with a general equivalence theorem, see Dette et al. (2007). The approach is elegant and behaves well numerically, but because it exploits problem specificities it may be somewhat limited for general applications.

To describe our algorithm in more detail, we need additional notation. Let υ(P) and E(P) be the sets of vertices and edges of P, respectively. Let m_0 be the user-specified sample size of ℙ^(0), and let U(Q) be the uniform distribution over a general compact domain Q enclosing a general sub-domain of the parameters' plausible region. Let m_{0,k} ≤ m_0 be the number of samples p comprising the vertices and other points randomly generated by sampling E(P). Further, let m_{0,r} = m_0 − m_{0,k} be the remaining number of samples, which are generated by sampling randomly from the overall domain P and not only from the edges E(P). The sampling procedure uses a uniform random number generator to draw elements from the closed domain Q, see Press et al. (2007, Chap. 7). We use an empirical relation based on the structure of P to set the value of m_{0,k}. Since P is formed from the Cartesian product of intervals for each of the n_p parameters, the compact domain contains 2^{n_p} vertices and n_p 2^{n_p − 1} edges. We then set m_{0,k} = 2^{n_p} + n_p 2^{n_p − 1}, and consequently m_{0,r} = m_0 − 2^{n_p} − n_p 2^{n_p − 1}. The collection of data samples in ℙ^(0) is constructed as

ℙ^(0) = {p_1, ..., p_{m_0}},   (20)

where p_1, ..., p_{2^{n_p}} are the vertices of P, p_{2^{n_p}+1}, ..., p_{m_{0,k}} are sampled from the uniform distribution U(E(P)) with one point from each edge, and p_{m_{0,k}+1}, ..., p_{m_0} are drawn from U(P).
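A possible rendering of this construction in Python (our sketch; it assumes m_0 ≥ m_{0,k} so that m_{0,r} ≥ 0) is:

import itertools
import numpy as np

def initial_parameter_set(lower, upper, m0, rng=np.random.default_rng(0)):
    """Build P^(0) as in (20): vertices, one point per edge, rest uniform in P."""
    lower, upper = np.asarray(lower, float), np.asarray(upper, float)
    n_p = len(lower)
    # all 2^{n_p} vertices of the box P
    vertices = [np.where(bits, upper, lower)
                for bits in itertools.product([0, 1], repeat=n_p)]
    # one uniform point on each of the n_p 2^{n_p - 1} edges
    edges = []
    for j in range(n_p):                      # j is the free coordinate of the edge
        others = [i for i in range(n_p) if i != j]
        for bits in itertools.product([0, 1], repeat=n_p - 1):
            theta = np.empty(n_p)
            for i, bit in zip(others, bits):
                theta[i] = upper[i] if bit else lower[i]
            theta[j] = rng.uniform(lower[j], upper[j])
            edges.append(theta)
    # the remaining m_{0,r} points are drawn uniformly from P
    interior = [rng.uniform(lower, upper)
                for _ in range(m0 - len(vertices) - len(edges))]
    return vertices + edges + interior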

The sampling scheme is similar to that used in the H-algorithm proposed by Fackle-Fornius et al. (2015) to first find a prior distribution for constructing Bayesian optimal designs, which is then used to check the optimality of minimax standardized designs. All our examples in Section 4 use m_0 = 50, but other values can be used. The above rule is empirical and seems to work well for finding the optimal designs sought here. We reason that the construction of ℙ^(0) relies on the properties of Φ_δ[M(ξ, p)^{−1}], and our experience is that while this function is neither convex nor concave (Pronzato and Pázman, 2013) over the parameter space, the vectors p for which a design is least efficient are usually the vertices of the plausible region or points located on one of the edges of P. We exploit this feature and so include the vertices and some randomly chosen points of E(P) in the initial set. We find that such a construction can sometimes reduce the computation time, especially when the optimum of the inner problem is one of the vertices.

Algorithm 1 below summarizes how we combine SDP with NLP to find minimax optimal designs, with accompanying details on the subproblems in Phases 1 and 2. Sections 3.1 and 3.2 provide additional details about the formulations and numerical solvers used in Algorithm 1.

Algorithm 1 - Algorithm to find minimax optimal designs

procedure MinimaxOptimalDesign(∆x, m_{0,r}, m_{0,k}, τ^L, τ^U, ε, l_j, u_j; ξ^OPT, p^OPT)
    Construct 𝕏 using intervals ∆x                      ▷ Discretization of the design space
    Construct ℙ^(0)                                     ▷ Sample m_0 points from P according to (20)
    j ← 1                                               ▷ Initialize iteration counter
    τ^(0)_LB ← τ^L, τ^(0)_UB ← τ^U
    while (τ^(j−1)_UB − τ^(j−1)_LB)/τ^(j−1)_UB > ε do   ▷ Convergence check
        Solve P1 to determine τ and ξ                   ▷ SDP problem
        τ^(j)_LB ← τ, ξ^(j) ← ξ                         ▷ Update τ^(j)_LB
        Solve P2 to determine τ and p                   ▷ NLP problem
        p^(j) ← p
        if τ < τ^(j−1)_UB then                          ▷ Update optimal design and worst parameters
            τ^(j)_UB ← τ, p^OPT ← p, ξ^OPT ← ξ
        else
            τ^(j)_UB ← τ^(j−1)_UB                       ▷ Update τ^(j)_UB
        end if
        ℙ^(j) ← ℙ^(j−1) ∪ {p^(j)}                       ▷ Update ℙ
        j ← j + 1                                       ▷ Update the iteration counter
    end while
end procedure

Here and throughout, we denote the design obtained after convergence by ξ^OPT and one set of worst-case values for the model parameters at convergence by p^OPT. The minimax criterion Ψ(ξ) := max_{p ∈ P} Φ_δ[M(ξ, p)^{−1}] is nonnegative (and even positive at the optimum, cf. Pukelsheim (1993)) and positively homogeneous, so it is natural to define the efficiency of a design ξ by

Eff(ξ) = Ψ(ξ*) / Ψ(ξ),

where ξ* denotes a minimax-optimal design. Upon convergence, the following proposition bounds the efficiency of the design returned by Algorithm 1:

Proposition 1. Let ξ^OPT be the design returned by Algorithm 1, where we assume that each subproblem P1 and P2 is solved exactly (to global optimality). Then, Eff(ξ^OPT) ≥ 1 − ε. More generally, if we assume that the subproblems P1 and P2 are solved within relative accuracies ε_L and ε_U, respectively, then the bound on the efficiency becomes

Eff(ξ^OPT) ≥ (1 − ε) (1 − ε_L) / (1 + ε_U).

Proof. We start with the case where each subproblem is solved exactly. By construction, τ^(j)_LB and τ^(j)_UB are lower and upper bounds for Ψ(ξ*), and at each iteration the design ξ^OPT stored by the algorithm satisfies Ψ(ξ^OPT) ≤ τ^(j)_UB.

Indeed, the value τ returned by P2 is the value of Ψ(ξ^(j)). It follows that

Eff(ξ^OPT) = Ψ(ξ*) / Ψ(ξ^OPT) ≥ τ^(j)_LB / τ^(j)_UB,

which is at least 1 − ε when the convergence criterion is satisfied. When the solutions to the subproblems P1 and P2 are approximate and not exact, the proof is similar. This follows by observing that Ψ(ξ*) ≥ τ^(j)_LB (1 − ε_L) and Ψ(ξ^OPT) ≤ τ^(j)_UB (1 + ε_U).
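The overall loop of Algorithm 1 can be paraphrased in a few lines of Python (a structural sketch only, not the implementation; solve_rmp stands for the Phase 1 SDP of Section 3.1 and solve_inner for the Phase 2 NLP of Section 3.2):

def minimax_design(solve_rmp, solve_inner, P0, eps, max_iter=100):
    P_hat = list(P0)                           # restricted master set, P^(0)
    tau_ub, xi_opt, p_opt = float("inf"), None, None
    for _ in range(max_iter):
        tau_lb, xi = solve_rmp(P_hat)          # Phase 1: design and lower bound
        tau, p = solve_inner(xi)               # Phase 2: worst-case parameters
        if tau < tau_ub:                       # keep the best design found so far
            tau_ub, xi_opt, p_opt = tau, xi, p
        if (tau_ub - tau_lb) / tau_ub <= eps:  # stopping rule (19)
            break
        P_hat.append(p)                        # delayed constraint, rule (18)
    return xi_opt, p_opt, tau_lb, tau_ub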

Algorithm 1 has some similarities with that proposed by Duarte and Wong (2014), and the main differences are listed below:

(a) the Phase 1 problem is formulated here as a semidefinite program instead of a nonlinear problem that requires global optimization solvers. The use of SDP in Phase 1 allows us to find approximate designs in polynomial time but requires the discretization of the design space, here represented by a finite number of candidate points. In contrast to Duarte and Wong (2014), the algorithm herein does not require an initial estimate of the number of support points.

(b) the Phase 2 problem is partially similar to that used in Duarte and Wong (2014). Specifically, the calculation of the answering set for the optimal design produced by Phase 1 is also represented as a NLP problem, and solved using a similar method, but it does not require iterating on the number of support points until the equivalence theorem is validated. That task is highly time consuming since it requires finding the maxima of the dispersion function for a given answering set, which is a rather challenging NLP with multiple optima.

(c) the algorithm proposed herein can easily be extended to find minimax A and E optimal designs by taking advantage of the semidefinite representability of those criteria; the one proposed in Duarte and Wong (2014) works only for the D optimality criterion. In practice, Algorithm 1 can be extended to every semidefinite representable criterion used for designing a multiple regression model, see Sagnol (2011).

(d) the set ℙ^(0) in our algorithm is formed by m_0 samples judiciously chosen from P, whereas in Duarte and Wong (2014), ℙ^(0) is formed by a single element p. This approach potentially increases the convergence rate.

Algorithm 1 also has common features with that proposed by Pázman and Pronzato (2014) for finding extended E and G optimal designs for protection against close-to-overlapping situations, recently extended to find D, A and E_k optimal designs for linear models (Burclová and Pázman, 2016). Here, we formulate the Phase 1 problem of finding optimal designs on a discrete design space as a SDP problem and exploit the semidefinite representability of the D, A and E optimality criteria. This is in contrast with the work of Pázman and Pronzato (2014), where the problem is handled by Linear Programming. The SDP formulation is harder to solve than the LP problem because the former requires specific Interior Point algorithms (Boyd and Vandenberghe, 2004), but it is broader in scope and, in particular, the formulation can be generalized to include all problems with a SDr criterion. Further, we solve the Phase 2 problem as a NLP, and consequently the convergence of the successive cutting planes that generate the upper bounds

is guaranteed if global optimization techniques are used. This desirable property is not found in the method proposed by Pázman and Pronzato (2014), where a combination of grid search and local optimization is used.

3.1. Phase 1 problem - P1

The SDP formulation for solving the problem P1 and obtaining the successive global lower bounds for the minimax program (10) is as follows. At the jth iteration, the goal is to find the design ξ^(j) with weights w^(j) = (w^(j)_1, ..., w^(j)_q)^T in the following SDP problem:

min_{ξ^(j) ∈ Ξ, τ^(j)_LB ∈ T} τ^(j)_LB   (21a)
s.t. Φ_δ[M(ξ^(j), p)^{−1}] ≤ τ^(j)_LB, ∀p ∈ ℙ^(j),   (21b)
Σ_{i=1}^q w^(j)_i = 1,   (21c)

where, for our interests, we set Φ_δ[M(ξ^(j), p)^{−1}] to be either tr[M(ξ^(j), p)^{−1}] for A optimality, (λ_min[M(ξ^(j), p)])^{−1} for E optimality, or {det[M(ξ^(j), p)^{−1}]}^{1/n_p} for D optimality, and each w^(j)_i is the weight of the point x_i ∈ 𝕏, i ∈ [q]. As we mentioned for the master problem (16), additional constraints of the form A w ≤ b can be added to the above SDP formulation for the case of optimization over Ξ_{A,b}.

To handle the SDP problems there are user-friendly interfaces, such as cvx (Grant et al., 2012) or Picos (Sagnol, 2012), that automatically transform the constraints (21b) into a series of LMIs before passing them to SDP solvers such as SeDuMi (Sturm, 1999) or Mosek (Andersen et al., 2009). This is possible if Φ_δ is SDr, which is true for our design criteria of interest. In our work, we solved all SDP problems using the cvx environment combined with the solver Mosek, which uses an efficient interior point algorithm.
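For instance, under the A optimality criterion the constraint tr[M(ξ^(j), p)^{−1}] ≤ τ in (21b) is SDr and is accepted directly by such interfaces. A cvxpy sketch of this restricted master problem (our illustration; fims is a hypothetical function returning the q elemental FIMs at a given p) is:

import cvxpy as cp
import numpy as np

def phase1_minimax_A(P_hat, fims, q, n_p, A=None, b=None):
    w = cp.Variable(q, nonneg=True)
    tau = cp.Variable()
    constraints = [cp.sum(w) == 1]
    if A is not None:                          # optional linear constraints (16d)
        constraints.append(A @ w <= b)
    for p in P_hat:                            # one SDr constraint per sampled p
        M = sum(w[i] * Mi for i, Mi in enumerate(fims(p)))
        constraints.append(cp.matrix_frac(np.eye(n_p), M) <= tau)  # tr(M^{-1}) <= tau
    cp.Problem(cp.Minimize(tau), constraints).solve()
    return tau.value, w.value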

3.2. Phase 2 problem - P2

This problem is to find the vector p, given the design determined in Phase 1, for which that design has the smallest efficiency. The problem is non-convex and requires nonlinear programming tools. Its solution generates successive global upper bounds for the minimax optimal design problem, with one obtained for each locally optimal design. Let us assume that at the jth iteration the design ξ^(j) has already been determined by solving (21). The problem to solve in this phase is

τ = max_{p ∈ P} Φ_δ[M(ξ^(j), p)^{−1}],   (22)

and we use the NLP solver IPOPT, which is based on an interior point algorithm (Wächter and Biegler, 2005). To increase the accuracy of the computed gradient, we use an automatic differentiation tool, ADiMat (Bischof et al., 2002), whose analytical representation is then passed to the solver to handle the program (22). We note that IPOPT is a local nonlinear programming solver, and the problem (22) may have multiple local optima. To handle this issue we developed a global nonlinear programming tool based on a multistart heuristic algorithm similar to that proposed by Ugray et al. (2005), and in Section 4.1 we compare the local NLP solver with our algorithm. In every iteration we assume that the optimum found by solving P2 is global and that τ is a global upper bound for the minimax problem. The optimal p is used to update the set of data samples in ℙ^(j) employing the rule (18). In many situations, an ε-optimal solution is found in the first iteration by the SDP solver, and P2 is used to compute the upper bound and, hence, to confirm the optimality of the design.

4. Results

In this section, we apply our proposed algorithm of Section 3 to find minimax A, E and D optimal designs for the power logistic model, the 4-parameter logistic model, and the general consecutive reaction model that represents the reactant concentrations in a continuous stirred tank. The second is also called the Emax model or the Hill model. Such logistic models are commonly used to study drug effects as the dose varies, see for example Ting (2006), and in other areas such as agronomy (Tsoularis and Wallace, 2002; Ludena et al., 2007). The third is described by a system of evolutive ordinary differential equations, and poses additional issues concerning the computation of the FIM. Except for the minimax D optimal designs for the logistic model, the other optimal designs, as far as we know, have not been published in the literature. The logistic model is considered a benchmark test for algorithms for finding optimal designs of experiments, and we use it here for comparison. First, we find minimax D optimal designs for the logistic model for different parameter uncertainty regions and compare the results with those of Duarte and Wong (2014). We then apply the algorithm to find minimax A and E optimal designs for all models. In all cases, we assume that there is a known plausible set of values for each parameter.

All computations in this paper were carried out on an Intel Core i7 machine (Intel Corporation, Santa Clara, CA) running the 64-bit Windows 10 operating system at 2.80 GHz. The relative and absolute tolerances used to solve the SDP problems were set to 10^{−5} in all problems. Similarly, the relative and absolute tolerances of the NLP solver were also set to 10^{−5} in all problems. The value of ε used in the convergence criterion in Algorithm 1 is

4.1. Example 1 - Logistic model

We consider the power logistic model proposed by Prentice (1976), commonly used in dose response studies when the outcome is binary:

E[y | x, p] = 1 / {1 + exp[−β (x − µ)]}^s,  x ∈ X, p ∈ P,   (23)

where X is a user-defined bounded interval and p = [β, µ, s]^T is the vector of model parameters, assumed to lie in a known compact set P. The probability of a response at dose x ∈ X is y(x, p), and the binary outcome is coded as 1 for response and 0 otherwise. When s = 1 we have the logistic model; other values of s provide more flexibility in capturing skewness and kurtosis, but for comparison purposes we first assume the simpler case where s = 1. Accordingly, let p = [β, µ]^T and assume that P = [µ_L, µ_U] × [β_L, β_U], which contains all plausible values of the two parameters. Here, µ_L is the lower bound of µ and µ_U is its upper bound; similarly, β_L is the lower bound of β and β_U is its upper bound. The FIM of the design at the point x_i is M(x_i, p) = h(x_i, p) h(x_i, p)^T, where

h(x_i, p) = {E[y | x_i, p] (1 − E[y | x_i, p])}^{−1/2} (∂E[y | x_i, p]/∂p),
∂E[y | x_i, p]/∂p = (∂E[y | x_i, p]/∂β, ∂E[y | x_i, p]/∂µ)^T.
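For s = 1 and p = [β, µ]^T these quantities are explicit, as in the short numpy sketch below (our illustration):

import numpy as np

def logistic_elemental_fim(x, beta, mu):
    """M(x, p) = h h^T for the logistic model, p = [beta, mu]^T."""
    pi = 1.0 / (1.0 + np.exp(-beta * (x - mu)))   # success probability at dose x
    dpi_dbeta = (x - mu) * pi * (1.0 - pi)        # dpi/dbeta
    dpi_dmu = -beta * pi * (1.0 - pi)             # dpi/dmu
    h = np.array([dpi_dbeta, dpi_dmu]) / np.sqrt(pi * (1.0 - pi))
    return np.outer(h, h)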

The initial population ℙ^(0) for the numerical tests presented in this section was constructed using the 2^2 vertices plus 16 points randomly sampled from the edges of P, one per edge. That is, we sample 16 points uniformly, each on an edge selected at random. More precisely, we use the following procedure to sample a vector θ on an edge of the plausible region: we first sample one index j uniformly in {1, ..., n_p}; then we sample θ_j uniformly at random in the interval [l_j, u_j], and for i = 1, ..., n_p, i ≠ j, we draw θ_i from the bounds {l_i, u_i} using a binary random number generator. In addition, we randomly sample 30 more points from P using a uniform random number generator, resulting in a total of 50 vectors p.

In our analysis we first study the impact of the size of the plausible parametric region on the optimal design and the computation time. We consider X ≡ [−1, 5] and four different regions for p that result from combining the intervals [1.0, 1.25] and [1.0, 3.0] for β with the intervals [0.0, 1.0] and [0.0, 3.5] for µ. Next, we study the impact of the design space range and solve the problem for X ≡ [3.0, 9.0] with µ ∈ [6.0, 8.0] and β ∈ [1.0, 1.25] and β ∈ [1.0, 3.0], respectively. In all these tests, the design space was discretized using equally spaced points ∆x = 0.02 units apart, and so for each of the design spaces X ≡ [−1, 5] and X ≡ [3.0, 9.0] we have 301 candidate design points. Table 1 presents the minimax D optimal designs for the different plausible parameter regions.

Our experience is that the size of X seems to have a small impact on the speed of our proposed method. The grid obtained after discretizing the design space, however, affects the computational time required by the Phase 1 problem. In practice, dense grids result in more nodes, and consequently larger semidefinite programs are produced, increasing the difficulty of solving the optimal design problem for a given set of parameter samples. The algorithm potentially performs well with a larger number of points, but the SDP solver might be unable to provide a solution in realistic time. Having denser grids allows us to obtain more accurate optimal designs, closer to those obtained if X is not discretized. In practice, one may increase the size of X while keeping the size of the SDP problem the same by manipulating the discretization grid intervals, to minimize the impact on computational time. The Phase 2 problem is independent of the design space because the search is made in the space of the parameters, and so it has a small impact on the CPU time.

The resulting designs in Table 1 show good agreement with the designs found with the SIP-based approach, which assumes X is continuous, see Duarte and Wong (2014). In some cases, the SDP-based minimax designs have more support points than the SIP-generated minimax designs. The additional points are usually close to each other and depend on how we discretize the design space. Table 1 shows that larger plausible regions tend to result in designs with more support points, a finding already observed by other authors for Bayesian and minimax designs, see for example Chaloner and Larntz (1989) and Duarte and Wong (2014). The CPU time required by Algorithm 1 to find minimax D optimal designs is, on average, 8.56 times shorter than that required by the SIP approach in Duarte and Wong (2014). The efficiency of each design in Table 1 is computed as

Eff = ( det[M(ξ^OPT, p^OPT)] / det[M(ξ*, p*)] )^{1/n_p},   (24)

where ξ^OPT is the optimal design obtained from the proposed algorithm, along with the worst-case parameters p^OPT, and ξ* is the design in Duarte and Wong (2014) with the corresponding answering set p*. We note that the efficiencies of the designs obtained from the proposed algorithm relative to the SIP-generated designs are all

close to one, and so this suggests that our algorithm produces optimal or highly efficient minimax designs while using shorter CPU time. It is possible to find optimal designs with an efficiency below 1 − ε because the reference design ξ^OPT in Proposition 1 is restricted to have support points on the grid, whereas the designs ξ* in Duarte and Wong (2014) are constructed assuming a continuous design space and so may perform better. The CPU time for the designs in Table 1 is larger for larger regions P, one reason being the increasing complexity of the NLPs solved to find p in each iteration.

To analyze the impact of the grid density on the optimal design and the CPU time, we consider P ≡ [1.0, 3.0] × [0.0, 1.0] and X ≡ [−1, 5], and discretize the design space using equally spaced points with ∆x = 0.08 (76 points), ∆x = 0.04 (151 points), ∆x = 0.02 (301 points) and ∆x = 0.01 (601 points), respectively. Table 2 shows that (i) the designs obtained from the different grid sets are very close and have similar efficiencies; and (ii) the CPU times are alike, since the larger semidefinite program is compensated by the higher convergence rate. Although the performance of the method does not deteriorate much as the size of the grid increases (cf. Table 2), the SDP approach usually fails when the grid includes between 3000 and 6000 design points, mainly due to memory problems, depending on the hardware. In some multi-factor problems it is not unusual to have millions of design points, and this approach cannot handle large design spaces unless coarser discretization grids are used.

Finally, to study the accuracy of the NLP solver in finding global optima in our algorithm, we implemented a multistart heuristic algorithm and compared its performance with the local NLP solver. To briefly describe the algorithm: we set the number of starting points, n_s, and generate n_s vectors of parameters p employing a uniform random generator over P. Then, we solve the Phase 2 problem by calling IPOPT n_s times, using a different starting point in each call. The optima and the objective function values are saved, and subsequently the best one(s) are picked. If a single optimum is found, ℙ is updated with a single vector p, while when more than one optimum is obtained all of them are included in ℙ, which then grows by more than one vector p per iteration. For comparison purposes we used n_s = 20, and for all the examples in Table 1 we obtained the same optimal designs found with the local NLP solver, although the CPU time required increased on average by a factor of 4.18. We also observed that in most of the problems and iterations the optimum is a single point, which is in agreement with our result in Table 1, where p^OPT is a singleton.
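This multistart wrapper can be sketched as follows (our illustration; solve_local stands for one call of the local NLP solver from a given start):

import numpy as np

def multistart_inner(solve_local, lower, upper, n_s=20,
                     rng=np.random.default_rng(0)):
    best_p, best_val = None, -np.inf
    for _ in range(n_s):
        p0 = rng.uniform(lower, upper)     # uniform starting point in P
        p, val = solve_local(p0)           # local maximization of the criterion
        if val > best_val:
            best_p, best_val = p, val
    return best_p, best_val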

Now, we analyze the accuracy of our algorithm in determining the vector p at which the criterion is least attainable.

Table 1: Minimax D optimal designs for the logistic model on X discretized using ∆x = 0.02, for different plausible regions. The design space is X ≡ [−1, 5] for the plausible regions with µ ∈ [0.0, 1.0] or µ ∈ [0.0, 3.5], and X ≡ [3, 9] for those with µ ∈ [6.0, 8.0]. Entries (x.xx, w.wwww) denote (design point, weight); a dot marks an entry that is not available.

[β_L, β_U] = [1.0, 1.25]:
  µ ∈ [0.0, 1.0]: (−0.84, 0.3810), (−0.82, 0.1190), (1.82, 0.1190), (1.84, 0.3810); p^OPT = [1.25, 1.0]^T
  µ ∈ [0.0, 3.5]: (−0.88, 0.2746), (1.74, 0.2254), (1.76, 0.2254), (4.38, 0.2746); p^OPT = [1.25, 3.5]^T
  µ ∈ [6.0, 8.0]: (5.16, ·), (5.18, 0.3428), (7.00, ·), (8.82, 0.3427), (8.84, 0.0049); p^OPT = [1.25, 8.0]^T

[β_L, β_U] = [1.0, 3.0]:
  µ ∈ [0.0, 1.0]: (−0.54, 0.2190), (−0.52, 0.1421), (0.50, 0.1193), (0.52, 0.1612), (1.52, 0.0514), (1.54, 0.3070); p^OPT = [3.0, 0.0]^T
  µ ∈ [0.0, 3.5]: (−0.40, 0.0491), (−0.38, 0.1516), (0.68, 0.2350), (1.52, 0.0384), (1.54, 0.0305), (2.02, 0.0610), (2.82, 0.1141), (2.84, 0.1251), (3.88, 0.0420), (3.90, 0.1532); p^OPT = [3.0, 3.5]^T
  µ ∈ [6.0, 8.0]: (5.66, ·), (6.86, 0.2408), (7.14, ·), (8.34, 0.2592); p^OPT = [3.0, 8.0]^T

Figure 1 displays the objective function, (det[M(ξ^OPT, p)])^{−1/n_p}, for the optimal design obtained with Algorithm 1 for P ≡ [1.0, 3.0] × [0.0, 1.0] and X ≡ [−1, 5], with the design space discretized using

equally spaced points with ∆x = 0.02 (second line, first column of Table 1).

Figure 1: Display of the objective function (det[M(ξ^OPT, p)])^{−1/n_p} vs. p for the optimal design obtained for P ≡ [1.0, 3.0] × [0.0, 1.0] and X ≡ [−1, 5], with the design space discretized using equally spaced points with ∆x = 0.02 (second line, first column of Table 1).

Table 2: Minimax D optimal designs for the logistic model on X ≡ [−1, 5] and P ≡ [1.0, 3.0] × [0.0, 1.0] when equally spaced grids of different sizes are used. Entries (x.xx, w.wwww) denote (design point, weight); a dot marks an entry that is not available.

∆x = 0.08: (−0.60, 0.0542), (−0.52, 0.3066), (0.44, 0.0295), (0.52, 0.2512), (1.48, 0.1057), (1.56, 0.2528); p^OPT = [3.0, 0.0]^T
∆x = 0.04: (−0.56, 0.1265), (−0.52, 0.2334), (0.48, 0.1552), (0.52, 0.1242), (1.52, 0.2422), (1.56, ·); p^OPT = [3.0, 0.0]^T
∆x = 0.02: (−0.54, 0.2190), (−0.52, 0.1421), (0.50, 0.1193), (0.52, 0.1612), (1.52, 0.0514), (1.54, 0.3070); p^OPT = [3.0, 0.0]^T
∆x = 0.01: (−0.55, 0.0760), (−0.54, 0.2799), (0.48, 0.2052), (0.49, 0.0789), (1.53, 0.1942), (1.54, 0.1658); p^OPT = [3.0, 0.0]^T

The plot shows that for the minimax design ξ^OPT the answering set is the singleton p^OPT = [3.0, 0.0]^T. Figure 2(a) displays the dispersion function of the SDP-generated design under the D-optimality criterion for the logistic model when X ≡ [−1, 5] and the plausible region is P ≡ [1.0, 3.0] × [0.0, 1.0]. The dispersion functions of the designs found by our algorithm at termination are constructed after the convergence condition is attained. Briefly, this requires determining a probability measure on P that validates the conditions of the equivalence theorems, using a LP formulation (Duarte and Wong, 2014). In practice, the construction of the dispersion function for minimax optimal designs is cumbersome because of the need to find a probability measure on the parameter domain before checking the General Equivalence Theorem for the design. The procedure becomes easier when the answering set is a singleton, which is what we have here, see Figure 1. The dispersion function is bounded above by 0 and peaks at the support points of the generated design. It follows that the dispersion function satisfies the conditions required by the equivalence theorem, and so confirms the minimax D-optimality of the generated design. Similar dispersion functions can be obtained for all designs in Table 1 and for

designs found by our algorithm under other criteria. For example, Figures 2(b) and 2(c) show, respectively, the dispersion functions of the A and E optimal designs in Table 3 found by our algorithm for the logistic model when P ≡ [1.0, 3.0] × [0.0, 1.0], ∆x = 0.02 and X ≡ [−1, 5].

In contrast to minimax D optimal designs, minimax A and E optimal designs have not been reported in the literature for the logistic model. Our proposed algorithm can be directly used to find such minimax optimal designs using the same discretization scheme. Table 3 displays selected minimax A and E optimal designs obtained when [β_L, β_U] = [1.0, 3.0] and there are different plausible regions for µ. We observe from Tables 1 and 3 that the minimax A and E optimal designs have more support points than the D optimal designs. Further, the computational times required by all the designs are quite similar, with the differences mainly due to the number of iterations required to reach convergence, as defined by the user-specified tolerance level, see equation (19). Overall, the proposed algorithm shows good flexibility and determines the various optimal designs without requiring a prohibitive amount of computational resources. The number of initial sampling points tends to have low impact on the convergence.
