D-optimal Designs for Factorial Experiments under Generalized Linear Models


D-optimal Designs for Factorial Experiments under Generalized Linear Models. Jie Yang, Department of Mathematics, Statistics, and Computer Science, University of Illinois at Chicago. Joint research with Abhyuday Mandal and Dibyen Majumdar. October 20, 2012.

1. Introduction: a motivating example; preliminary setup
2. Locally D-optimal designs: characterization of locally D-optimal designs; saturated designs; the lift-one algorithm for searching for locally D-optimal designs
3. EW D-optimal designs: definition; examples
4. Robustness: robustness to misspecification of w; robustness of the uniform design and the EW design
5. Example: the motivating example revisited; conclusions

A motivating example: Hamada and Nelder (1997). Two-level factors: (A) poly-film thickness, (B) oil mixture ratio, (C) material of gloves, and (D) condition of metal blanks. Response: whether the windshield molding was good or not.

Row   A   B   C   D   Replicates   Good moldings
 1    +   +   +   +      1000           338
 2    +   +   -   -      1000           826
 3    +   -   +   -      1000           350
 4    +   -   -   +      1000           647
 5    -   +   +   -      1000           917
 6    -   +   -   +      1000           977
 7    -   -   +   +      1000           953
 8    -   -   -   -      1000           972

Question: Can we do something better?

Preliminary setup. Consider an experiment with m fixed and distinct design points:

X = (x_1', x_2', ..., x_m')' =
    [ x_11  x_12  ...  x_1d ]
    [ x_21  x_22  ...  x_2d ]
    [  ...   ...  ...   ... ]
    [ x_m1  x_m2  ...  x_md ]

For example, a 2^k factorial experiment with a main-effects model has m = 2^k, d = k + 1, x_i1 = 1, and x_i2, ..., x_id ∈ {-1, 1}.

Exact design problem: given n, find optimal replicate numbers n_i such that n_i ≥ 0 and Σ_{i=1}^m n_i = n.
Approximate design problem: let p_i = n_i / n; find optimal proportions p_i such that p_i ≥ 0 and Σ_{i=1}^m p_i = 1.
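To make this setup concrete, the following minimal Python sketch (our own naming, not from the slides) builds the m×d model matrix X of a 2^k main-effects model, so that m = 2^k and d = k + 1:

```python
import itertools

import numpy as np

def main_effects_matrix(k):
    """m x d model matrix of a 2^k main-effects model (m = 2^k, d = k + 1):
    an intercept column of ones followed by k factor columns with levels -1/+1."""
    levels = itertools.product([-1, 1], repeat=k)
    return np.array([[1] + list(row) for row in levels], dtype=float)

X = main_effects_matrix(2)   # the 4 x 3 matrix of the 2^2 example below
print(X)
```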

Generalized linear model: single parameter. Consider independent univariate responses Y_1, ..., Y_n:

Y_i ~ f(y; θ_i) = exp{y b(θ_i) + c(θ_i) + d(y)}

For example:
exp{y log(θ/(1-θ)) + log(1-θ)}: Bernoulli(θ)
exp{y log θ - θ - log y!}: Poisson(θ)
exp{-y/θ + (k-1) log y - k log θ - log Γ(k)}: Gamma(k, θ), fixed k > 0
exp{yθ/σ² - θ²/(2σ²) - y²/(2σ²) - (1/2) log(2πσ²)}: N(θ, σ²), fixed σ² > 0

Generalized linear model (McCullagh and Nelder 1989, Dobson 2008): a link function g and parameters of interest β = (β_1, ..., β_d)', such that E(Y_i) = µ_i and η_i = g(µ_i) = x_i'β.

Information matrix and D-optimal design. Recall that there are m distinct predictor combinations x_1, ..., x_m with numbers of replicates n_1, ..., n_m, respectively. The maximum likelihood estimator of β has an asymptotic covariance matrix that is the inverse of the information matrix

I = n X'WX,

where X = (x_1, ..., x_m)' is an m×d matrix and W = diag(p_1 w_1, ..., p_m w_m) with

w_i = (1 / var(Y_i)) (∂µ_i/∂η_i)².

A D-optimal design is the p = (p_1, ..., p_m)' which maximizes |X'WX| with given w_1, ..., w_m.
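The D-criterion is a one-line computation once X, p, and w are in hand; a small sketch (our naming, with toy weights) of |X'WX| as defined above:

```python
import numpy as np

def d_criterion(X, p, w):
    """|X' W X| with W = diag(p_1 w_1, ..., p_m w_m)."""
    return np.linalg.det(X.T @ np.diag(np.asarray(p) * np.asarray(w)) @ X)

X = np.array([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]], dtype=float)
w = np.array([0.20, 0.25, 0.22, 0.10])      # toy weights w_i = nu(x_i' beta)
print(d_criterion(X, np.full(4, 0.25), w))  # criterion value of the uniform design
```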

w as a function of β. Suppose the link function g is one-to-one and differentiable, and suppose further that µ_i itself determines var(Y_i). Then w_i = ν(η_i) = ν(x_i'β) for some function ν.
Binary response, logit link: ν(η) = 1/(2 + e^η + e^{-η}).
Poisson count, log link: ν(η) = e^η.
Gamma response, reciprocal link: ν(η) = k/η².
Normal response, identity link: ν(η) ≡ 1/σ².
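These weight functions translate directly into code; a short sketch (our naming) of ν(η) for the four cases listed above:

```python
import numpy as np

def nu_logit(eta):
    """Binary response, logit link: nu(eta) = 1 / (2 + e^eta + e^(-eta))."""
    return 1.0 / (2.0 + np.exp(eta) + np.exp(-eta))

def nu_poisson_log(eta):
    """Poisson count, log link: nu(eta) = exp(eta)."""
    return np.exp(eta)

def nu_gamma_reciprocal(eta, k=1.0):
    """Gamma response, reciprocal link: nu(eta) = k / eta^2."""
    return k / eta ** 2

def nu_normal_identity(eta, sigma2=1.0):
    """Normal response, identity link: nu(eta) = 1 / sigma^2 (constant in eta)."""
    return np.full_like(np.asarray(eta, dtype=float), 1.0 / sigma2)
```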

w_i = ν(η_i) = ν(x_i'β) for binary response. [Figure: ν(η) plotted against η = x'β for the logit, probit, complementary log-log, and log-log links.]

Example: 2^2 experiment with main-effects model.

X = [ 1  -1  -1 ]
    [ 1  -1  +1 ]
    [ 1  +1  -1 ]
    [ 1  +1  +1 ],       W = diag(w_1 p_1, w_2 p_2, w_3 p_3, w_4 p_4)

The optimization problem is to maximize

|X'WX| = 16 w_1 w_2 w_3 w_4 L(p),

where v_i = 1/w_i and

L(p) = v_4 p_1 p_2 p_3 + v_3 p_1 p_2 p_4 + v_2 p_1 p_3 p_4 + v_1 p_2 p_3 p_4.
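The factorization can be verified numerically; a quick check with toy weights and a toy allocation (our values, not from the slides):

```python
import numpy as np

X = np.array([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]], dtype=float)
w = np.array([0.10, 0.15, 0.20, 0.25])      # toy weights
p = np.array([0.30, 0.20, 0.30, 0.20])      # toy allocation
v = 1.0 / w

L = (v[3] * p[0] * p[1] * p[2] + v[2] * p[0] * p[1] * p[3]
     + v[1] * p[0] * p[2] * p[3] + v[0] * p[1] * p[2] * p[3])
direct = np.linalg.det(X.T @ np.diag(p * w) @ X)
print(direct, 16 * np.prod(w) * L)          # the two values agree
```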

General case: order-d polynomial. For the general case, X is an m×d matrix with distinct rows and W = diag(p_1 w_1, ..., p_m w_m). Based on González-Dávila, Dorta-Guerra and Ginebra (2007) and Yang, Mandal and Majumdar (2012b), we have:

Lemma. Let X[i_1, i_2, ..., i_d] be the d×d sub-matrix consisting of the i_1th, ..., i_dth rows of the design matrix X. Then

f(p) = |X'WX| = Σ_{1 ≤ i_1 < ... < i_d ≤ m} |X[i_1, ..., i_d]|² p_{i_1} w_{i_1} ... p_{i_d} w_{i_d}.
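The Lemma can also be checked by brute force, summing squared d×d subdeterminants over all row subsets; a sketch (our code):

```python
import itertools

import numpy as np

def det_by_subsets(X, p, w):
    """Right-hand side of the Lemma: sum over all d-row subsets of
    |X[i_1, ..., i_d]|^2 * p_{i_1} w_{i_1} * ... * p_{i_d} w_{i_d}."""
    m, d = X.shape
    pw = np.asarray(p) * np.asarray(w)
    return sum(np.linalg.det(X[list(rows)]) ** 2 * np.prod(pw[list(rows)])
               for rows in itertools.combinations(range(m), d))

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(8), rng.choice([-1.0, 1.0], size=(8, 3))])
p = rng.dirichlet(np.ones(8))
w = rng.uniform(0.05, 0.30, size=8)
print(det_by_subsets(X, p, w),
      np.linalg.det(X.T @ np.diag(p * w) @ X))   # both sides of the Lemma agree
```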

Characterization of locally D-optimal designs. For each i = 1, ..., m and 0 ≤ x < 1, we define

f_i(x) = f( p_1 (1-x)/(1-p_i), ..., p_{i-1} (1-x)/(1-p_i), x, p_{i+1} (1-x)/(1-p_i), ..., p_m (1-x)/(1-p_i) )
       = a x (1-x)^{d-1} + b (1-x)^d.

If p_i > 0, then b = f_i(0) and a = [f(p) - b (1-p_i)^d] / [p_i (1-p_i)^{d-1}]; otherwise b = f(p) and a = f_i(1/2) 2^d - b.

Theorem. Suppose f(p) > 0. Then p is D-optimal if and only if, for each i = 1, ..., m, one of the two conditions below is satisfied:
(i) p_i = 0 and f_i(1/2) ≤ [(d+1)/2^d] f(p);
(ii) 0 < p_i ≤ 1/d and f_i(0) = [(1 - d p_i)/(1 - p_i)^d] f(p).
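The two conditions are easy to verify numerically for a candidate allocation; a rough sketch (our code), using f, f_i, a, and b as defined above:

```python
import numpy as np

def f(X, p, w):
    return np.linalg.det(X.T @ np.diag(np.asarray(p) * np.asarray(w)) @ X)

def f_i(X, p, w, i, x):
    """f at the allocation that puts mass x on point i and rescales the rest."""
    q = np.asarray(p, dtype=float) * (1.0 - x) / (1.0 - p[i])
    q[i] = x
    return f(X, q, w)

def is_locally_d_optimal(X, p, w, tol=1e-8):
    """Check conditions (i) and (ii) of the Theorem for every i."""
    m, d = X.shape
    fp = f(X, p, w)
    for i in range(m):
        if p[i] <= tol:                                   # condition (i)
            if f_i(X, p, w, i, 0.5) > (d + 1) / 2 ** d * fp + tol:
                return False
        else:                                             # condition (ii)
            lhs = f_i(X, p, w, i, 0.0)
            rhs = (1 - d * p[i]) / (1 - p[i]) ** d * fp
            if p[i] > 1 / d + tol or abs(lhs - rhs) > tol * max(fp, 1.0):
                return False
    return True
```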

Saturated designs.

Theorem. Let I = {i_1, ..., i_d} ⊆ {1, ..., m} satisfy |X[i_1, ..., i_d]| ≠ 0. Then p_{i_1} = p_{i_2} = ... = p_{i_d} = 1/d is D-optimal if and only if, for each i ∉ I,

Σ_{j ∈ I} |X[{i} ∪ I \ {j}]|² / w_j  ≤  |X[i_1, i_2, ..., i_d]|² / w_i.

2^2 main-effects model: p_1 = p_2 = p_3 = 1/3 is D-optimal if and only if v_1 + v_2 + v_3 ≤ v_4, where v_i = 1/w_i.
2^3 main-effects model: p_1 = p_4 = p_6 = p_7 = 1/4 is D-optimal if and only if v_1 + v_4 + v_6 + v_7 ≤ 4 min{v_2, v_3, v_5, v_8}.
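For the 2^2 case, the saturated-design condition reduces to a one-line check; a toy illustration (our weights):

```python
import numpy as np

# 2^2 main-effects model: p_1 = p_2 = p_3 = 1/3 is D-optimal iff
# v_1 + v_2 + v_3 <= v_4, with v_i = 1/w_i.
w = np.array([0.25, 0.20, 0.22, 0.03])   # toy weights; point 4 is nearly uninformative
v = 1.0 / w
print(v[0] + v[1] + v[2] <= v[3])        # True: the saturated design on {1, 2, 3} is D-optimal
```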

Saturated designs: more examples. 2×3 factorial design: suppose the design matrix is

X = [ 1  -1  -1   1 ]
    [ 1  -1   0  -2 ]
    [ 1  -1   1   1 ]
    [ 1   1  -1   1 ]
    [ 1   1   0  -2 ]
    [ 1   1   1   1 ].

p_1 = p_2 = p_3 = p_4 = 1/4 is D-optimal if and only if v_1 + v_2 + v_4 ≤ v_5 and v_1 + v_3 + v_4 ≤ v_6.
p_2 = p_3 = p_4 = p_5 = 1/4 is D-optimal if and only if v_2 + v_4 + v_5 ≤ v_1 and v_2 + v_3 + v_5 ≤ v_6.
p_3 = p_4 = p_5 = p_6 = 1/4 is D-optimal if and only if v_3 + v_4 + v_6 ≤ v_1 and v_3 + v_5 + v_6 ≤ v_2.

Saturation condition in terms of β: logit link, 2^2 design. For fixed β_0, a pair (β_1, β_2) satisfies the saturation condition if and only if the corresponding point lies above the curve labelled by that β_0. [Figure: boundary curves in the (β_1, β_2) plane over [0, 3] × [0, 3] for β_0 = ±0.5, ±1, ±2.]

Lift-one algorithm (Yang, Mandal and Majumdar, 2012b). Given X and w_1, ..., w_m, search for p = (p_1, ..., p_m)' maximizing f(p) = |X'WX|:
1. Start with an arbitrary p_0 = (p_1, ..., p_m)' satisfying 0 < p_i < 1, i = 1, ..., m, and compute f(p_0).
2. Set up a random order of i going through {1, 2, ..., m}.
3. For each i, determine f_i(z). In this step, either f_i(0) or f_i(1/2) needs to be calculated.
4. Define p_*^(i) = ( p_1 (1-z_*)/(1-p_i), ..., p_{i-1} (1-z_*)/(1-p_i), z_*, p_{i+1} (1-z_*)/(1-p_i), ..., p_m (1-z_*)/(1-p_i) ), where z_* maximizes f_i(z) over 0 ≤ z ≤ 1. Here f(p_*^(i)) = f_i(z_*).
5. Replace p_0 with p_*^(i) and f(p_0) with f(p_*^(i)).
6. Repeat steps 2-5 until convergence, that is, until f(p_0) = f(p_*^(i)) for each i.
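A compact Python sketch of this iteration is given below (our implementation, not the authors' code). It recovers a and b as defined earlier and maximizes f_i(z) = a z (1-z)^(d-1) + b (1-z)^d in closed form:

```python
import numpy as np

def lift_one(X, w, p0=None, max_sweeps=1000, tol=1e-10, seed=0):
    """Search for p maximizing f(p) = |X' diag(p*w) X| (a sketch of the lift-one idea)."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    w = np.asarray(w, dtype=float)
    p = np.full(m, 1.0 / m) if p0 is None else np.asarray(p0, dtype=float)

    def f(q):
        return np.linalg.det(X.T @ np.diag(q * w) @ X)

    def lifted(i, z):                            # allocation with mass z on point i
        q = p * (1.0 - z) / (1.0 - p[i])
        q[i] = z
        return q

    fp = f(p)
    for _ in range(max_sweeps):
        improved = False
        for i in rng.permutation(m):             # step 2: random order of i
            # step 3: recover a, b in f_i(z) = a z (1-z)^(d-1) + b (1-z)^d
            if p[i] > 0:
                b = f(lifted(i, 0.0))
                a = (fp - b * (1 - p[i]) ** d) / (p[i] * (1 - p[i]) ** (d - 1))
            else:
                b = fp
                a = f(lifted(i, 0.5)) * 2 ** d - b

            def fi(z):
                return a * z * (1 - z) ** (d - 1) + b * (1 - z) ** d

            # step 4: maximize f_i(z) on [0, 1]; interior stationary point, if any
            cands = [0.0]
            if abs(a - b) > 1e-15:
                z0 = (a - b * d) / (d * (a - b))
                if 0.0 < z0 < 1.0:
                    cands.append(z0)
            z_star = max(cands, key=fi)
            if fi(z_star) > fp * (1.0 + tol):    # step 5: accept the improvement
                p, fp, improved = lifted(i, z_star), fi(z_star), True
        if not improved:                         # step 6: converged
            break
    return p, fp

# Toy usage: 2^2 main-effects model with toy weights
X = np.array([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]], dtype=float)
p_opt, crit = lift_one(X, np.array([0.20, 0.25, 0.22, 0.05]))
print(np.round(p_opt, 3), crit)
```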

Comments on the lift-one algorithm. The basic idea of the lift-one algorithm is that, for a randomly chosen i ∈ {1, ..., m}, we update p_i to p_i^* and all the other p_j's to p_j^* = p_j (1 - p_i^*)/(1 - p_i). The major advantage of the lift-one algorithm is that, in order to determine an optimal p_i, we need to calculate |X'WX| only once. The algorithm is motivated by the coordinate descent algorithm (Zangwill, 1969). It is also similar in spirit to the idea of one-point correction in the literature (Wynn, 1970; Fedorov, 1972; Müller, 2007), where design points are added or adjusted one by one.

Convergence of the lift-one algorithm. Modified lift-one algorithm: (1) For every 10m-th iteration, with a fixed order of i = 1, ..., m, we repeat steps 3-5; if the lift-one step finds an allocation p_*^(i) better than p_0, instead of updating p_0 to p_*^(i) immediately, we obtain p_*^(i) for each i and replace p_0 with the best p_*^(i) only. (2) For iterations other than the 10m-th, we follow the original lift-one update.

Theorem. When the lift-one algorithm or the modified lift-one algorithm converges, the converged allocation p maximizes |X'WX| on the set of feasible allocations. Furthermore, the modified lift-one algorithm is guaranteed to converge.

Performance of lift-one algorithm: comparison.

Table: CPU time in seconds for 100 simulated β.

              Nelder-Mead   quasi-Newton   conjugate gradient   simulated annealing   lift-one
2^2 designs        0.77          0.41              4.68                 46.53            0.17
2^3 designs       46.75         63.18             925.5                  1495            0.46
2^4 designs       125.9           NA                NA                     NA            1.45

Performance of lift-one algorithm: number of nonzero p_i's. Under the 2^k main-effects model with binary response and logit link, with the β_i's i.i.d. Uniform(-3, 3): [Figure: histograms of the number of support points of the D-optimal design, one panel for each of k = 2, 3, 4, 5, 6, 7.]

Performance of lift-one algorithm: time cost. Under the 2^k main-effects model with binary response and logit link, we simulate the β_i's i.i.d. from a uniform distribution and record the time cost in seconds of the lift-one algorithm. [Figure: (a) average time cost and (b) average number of nonzero p_i's, each plotted against m = 2^k, for β ~ U(-0.5, 0.5), U(-1, 1), and U(-3, 3).]

EW D-optimal designs. For experiments under generalized linear models, we may need to specify the β_i's, which give us the w_i's, in order to obtain D-optimal designs; this is known as local D-optimality. An EW D-optimal design is an allocation p that maximizes |X' E(W) X|. It is one of several alternatives suggested by Atkinson, Donev and Tobias (2007). EW D-optimal designs are often approximately as efficient as Bayesian D-optimal designs, they can be obtained easily using the lift-one algorithm, and in general they are robust.
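As an illustration, E(W) is straightforward to estimate by Monte Carlo; the sketch below (our code) does so for a 2^2 logit-link model with the independent priors β_0 ~ U(-1, 1) and β_1, β_2 ~ U[0, 1) used in the example that follows. Feeding E(w) into the lift-one sketch above then yields the EW D-optimal allocation.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.array([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]], dtype=float)

B = 100_000                                        # Monte Carlo sample size
beta = np.column_stack([rng.uniform(-1, 1, B),     # beta_0 ~ U(-1, 1)
                        rng.uniform(0, 1, B),      # beta_1 ~ U[0, 1)
                        rng.uniform(0, 1, B)])     # beta_2 ~ U[0, 1)
eta = beta @ X.T                                   # B x 4 linear predictors x_i' beta
w_draws = 1.0 / (2.0 + np.exp(eta) + np.exp(-eta)) # logit-link nu(eta)
E_w = w_draws.mean(axis=0)                         # estimated E(w_i), one per design point
print(np.round(E_w, 4))
```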

EW D-optimal design and Bayesian D-optimal design.

Theorem. For any given link function, if the regression coefficients β_0, β_1, ..., β_d are independent with finite expectation, and β_1, ..., β_d all have distributions symmetric about 0 (not necessarily the same distribution), then the uniform design is an EW D-optimal design.

A Bayesian D-optimal design maximizes E(log|X'WX|), where the expectation is taken over the prior distribution of the β_i's. Note that, by Jensen's inequality,

E(log|X'WX|) ≤ log|X' E(W) X|,

since log|X'WX| is concave in w. Thus an EW D-optimal design maximizes an upper bound of the Bayesian D-optimality criterion.
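The Jensen bound is easy to see numerically; a small Monte Carlo sketch (our code) compares E(log|X'WX|) with log|X'E(W)X| at the uniform design for the same 2^2 logit-link setting:

```python
import numpy as np

rng = np.random.default_rng(2)
X = np.array([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]], dtype=float)
p = np.full(4, 0.25)                               # uniform design

B = 5000
beta = np.column_stack([rng.uniform(-1, 1, B),
                        rng.uniform(0, 1, B),
                        rng.uniform(0, 1, B)])
w_draws = 1.0 / (2.0 + np.exp(beta @ X.T) + np.exp(-(beta @ X.T)))

log_dets = [np.log(np.linalg.det(X.T @ np.diag(p * w) @ X)) for w in w_draws]
bayes_obj = np.mean(log_dets)                                           # E log|X'WX|
ew_obj = np.log(np.linalg.det(X.T @ np.diag(p * w_draws.mean(0)) @ X))  # log|X' E(W) X|
print(bayes_obj, ew_obj)                           # bayes_obj <= ew_obj, as Jensen predicts
```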

Example: 2^2 main-effects model. Suppose β_0, β_1, β_2 are independent, β_0 ~ U(-1, 1), and β_1, β_2 ~ U[0, 1). Under the logit link, the EW D-optimal design is p_e = (0.239, 0.261, 0.261, 0.239). The Bayesian optimal design, which maximizes φ(p) = E log|X'WX|, is p_o = (0.235, 0.265, 0.265, 0.235). The relative efficiency of p_e with respect to p_o is

exp{ [φ(p_e) - φ(p_o)] / (k+1) } × 100% = 99.99%

for the logit link, 99.94% for the probit link, 99.77% for the log-log link, and 100.00% for the complementary log-log link. The time cost for EW is 0.11 seconds, while it is 5.45 seconds for maximizing φ(p). It should also be noted that the relative efficiency of the uniform design p_u = (1/4, 1/4, 1/4, 1/4) with respect to p_o is 99.88% for the logit link and 89.6% for the complementary log-log link.

Example: 2^3 main-effects model. Suppose β_0, β_1, β_2, β_3 are independent, and the experimenter has the following prior information for the parameters: β_0 ~ U(-3, 3) and β_1, β_2, β_3 ~ U[0, 3). For the logit link the uniform design is not EW D-optimal. In this case, the EW solution is p_e = (0, 1/6, 1/6, 1/6, 1/6, 1/6, 1/6, 0), while the design maximizing φ(p) is p_o = (0.004, 0.165, 0.166, 0.165, 0.165, 0.166, 0.165, 0.004). The relative efficiency of p_e with respect to p_o is 99.98%. On the other hand, the relative efficiency of the uniform design with respect to p_o is 94.39%. It takes about 2.39 seconds to find an EW solution, while it takes 121.73 seconds to find p_o. The difference in computational time is even more prominent in the 2^4 case (24 seconds versus 3147 seconds).

Robustness to misspecification of w. Denote the D-criterion value by ψ(p, w) = |X'WX| for given w = (w_1, ..., w_m)' and p = (p_1, ..., p_m)'. Define the relative loss of efficiency of p with respect to w as

R(p, w) = 1 - [ ψ(p, w) / ψ(p_w, w) ]^{1/d},

where p_w is a D-optimal allocation with respect to w. Define the maximum relative loss of efficiency of a given design p with respect to a specified region W of w by

R_max(p) = max_{w ∈ W} R(p, w).
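Both quantities are simple to compute once a D-optimal allocation p_w is available; the sketch below (our code) reuses the lift_one function from the earlier sketch and approximates R_max by maximizing over a finite sample of w's drawn from the region W:

```python
import numpy as np

def relative_loss(X, p, w):
    """R(p, w) = 1 - (psi(p, w) / psi(p_w, w))^(1/d), with p_w found by lift_one
    (the lift_one sketch above is assumed to be in scope)."""
    d = X.shape[1]
    psi = lambda q: np.linalg.det(X.T @ np.diag(np.asarray(q) * np.asarray(w)) @ X)
    _, best = lift_one(X, w)
    return 1.0 - (psi(p) / best) ** (1.0 / d)

def max_relative_loss(X, p, w_samples):
    """Monte Carlo surrogate for R_max(p): maximize R(p, w) over sampled w's."""
    return max(relative_loss(X, p, w) for w in w_samples)

# Toy usage: uniform design on the 2^2 model, with w_i drawn from [a, b] = [0.05, 0.25]
rng = np.random.default_rng(3)
X = np.array([[1, -1, -1], [1, -1, 1], [1, 1, -1], [1, 1, 1]], dtype=float)
w_samples = rng.uniform(0.05, 0.25, size=(200, 4))
print(max_relative_loss(X, np.full(4, 0.25), w_samples))
```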

Robustness of uniform design: 2^2 main-effects model. Yang, Mandal and Majumdar (2012a) showed that, under a 2^2 experiment with a main-effects model, if w_i ∈ [a, b], i = 1, 2, 3, 4, with 0 < a ≤ b, then the uniform design p_u = (1/4, 1/4, 1/4, 1/4) is the most robust one in terms of maximum relative loss of efficiency. On the other hand, if the experimenter has some prior knowledge about the model parameters, for example if one believes that β_0 ~ Uniform(-1, 1), β_1, β_2 ~ Uniform[0, 1), and β_0, β_1, β_2 are independent, then the theoretical R_max of the uniform design is 0.134, while the theoretical R_max of the design p = (0.19, 0.31, 0.31, 0.19) is 0.116. That is, the uniform design may not be the most robust one.

Robustness of uniform design: 2^4 main-effects model. Simulate β_0, ..., β_4 1000 times and calculate the corresponding w_s; for each w_s, obtain a D-optimal allocation p_s. Percentages:

                (1)          (2)         (3)          (4)
β_0          U(-3, 3)     U(-1, 1)    U(-3, 0)      N(0, 5)
β_1          U(-1, 1)     U(0, 1)     U(1, 3)       N(0, 1)
β_2          U(-1, 1)     U(0, 1)     U(1, 3)       N(2, 1)
β_3          U(-1, 1)     U(0, 1)     U(-3, -1)     N(.5, 2)
β_4          U(-1, 1)     U(0, 1)     U(-3, -1)     N(.5, 2)

           (I)  (II) (III)   (I)  (II) (III)   (I)  (II) (III)   (I)  (II) (III)
R_99       .35  .35  .35     .15  .11  .11     .50  .27  .30     .65  .86  .73
R_95       .30  .30  .30     .13  .09  .09     .50  .25  .26     .62  .79  .67
R_90       .27  .27  .27     .12  .08  .09     .49  .24  .23     .59  .74  .63

Note: (I) = R_100α(p_u), uniform design; (II) = min_{1 ≤ s ≤ 1000} R_100α(p_s), best among the 1000; (III) = R_100α(p_e), EW design.

Windshield experiment: revisited (Yang, Mandal and Majumdar, 2012b). Based on the data presented in Hamada and Nelder (1997), β̂ = (1.77, -1.57, 0.13, -0.80, -0.14) under the logit link. The efficiency of the original 2^{4-1} design p_HN is 78% relative to the locally D-optimal design if β̂ were the true value. It might be reasonable to consider an initial guess of β = (2, -1.5, 0.1, -1, -0.1); this leads to the locally D-optimal half-fractional design p_a with relative efficiency 99%. Another reasonable option is to consider a range, for example β_0 ~ Unif(1, 3), β_1 ~ Unif(-3, -1), β_2, β_4 ~ Unif(-0.5, 0.5), and β_3 ~ Unif(-1, 0); the relative efficiency of the EW D-optimal half-fractional design p_e is then 98%.

Windshield experiment: optimal half-fractional design.

Row   A    B    C    D     η      π     p_HN    p_a     p_e
 5   +1   -1   +1   +1   -0.87  0.295           0.044   0.184
 1   +1   +1   +1   +1   -0.61  0.352   0.125   0.178   0.011
 6   +1   -1   +1   -1   -0.59  0.357   0.125   0.178   0.011
 2   +1   +1   +1   -1   -0.33  0.418           0.059   0.184
 7   +1   -1   -1   +1    0.73  0.675   0.125   0.163
 3   +1   +1   -1   +1    0.99  0.729                   0.195
 8   +1   -1   -1   -1    1.01  0.733                   0.195
 4   +1   +1   -1   -1    1.27  0.781   0.125   0.147
13   -1   -1   +1   +1    2.27  0.906   0.125   0.158   0.111
 9   -1   +1   +1   +1    2.53  0.926
14   -1   -1   +1   -1    2.55  0.928
10   -1   +1   +1   -1    2.81  0.943   0.125   0.074   0.110
15   -1   -1   -1   +1    3.87  0.980
11   -1   +1   -1   +1    4.13  0.984   0.125
16   -1   -1   -1   -1    4.15  0.984   0.125
12   -1   +1   -1   -1    4.41  0.988

Conclusions. We consider the problem of obtaining locally D-optimal designs for experiments with a fixed finite set of design points under generalized linear models. We obtain a characterization for a design to be locally D-optimal. Based on this characterization, we develop efficient numerical techniques to search for locally D-optimal designs. We suggest the use of EW D-optimal designs: these are much easier to compute and still highly efficient compared with Bayesian D-optimal designs. We investigate the properties of fractional factorial designs (not presented here; see Yang, Mandal and Majumdar, 2012b). We also study the robustness of the D-optimal designs.

References
1. Hamada, M. and Nelder, J. A. (1997). Generalized linear models for quality-improvement experiments. Journal of Quality Technology, 29, 292-304.
2. McCullagh, P. and Nelder, J. (1989). Generalized Linear Models, Second Edition. Chapman and Hall.
3. Dobson, A. J. (2008). An Introduction to Generalized Linear Models, 3rd edition. Chapman and Hall/CRC.
4. González-Dávila, E., Dorta-Guerra, R. and Ginebra, J. (2007). On the information in two-level experiments. Model Assisted Statistics and Applications, 2, 173-187.
5. Yang, J., Mandal, A. and Majumdar, D. (2012a). Optimal designs for two-level factorial experiments with binary response. Statistica Sinica, 22, 885-907.
6. Yang, J., Mandal, A. and Majumdar, D. (2012b). Optimal designs for 2^k factorial experiments with binary response. Submitted for publication; available at http://arxiv.org/pdf/1109.5320v3.pdf

More references
1. Zangwill, W. (1969). Nonlinear Programming: A Unified Approach. Prentice-Hall, New Jersey.
2. Wynn, H. P. (1970). The sequential generation of D-optimum experimental designs. Annals of Mathematical Statistics, 41, 1655-1664.
3. Fedorov, V. V. (1972). Theory of Optimal Experiments. Academic Press, New York.
4. Müller, W. G. (2007). Collecting Spatial Data: Optimum Design of Experiments for Random Fields, 3rd Edition. Springer.
5. Atkinson, A. C., Donev, A. N. and Tobias, R. D. (2007). Optimum Experimental Designs, with SAS. Oxford University Press.