Designs for weighted least squares regression, with estimated weights


Stat Comput (2013) 23:391–401

Designs for weighted least squares regression, with estimated weights

Douglas P. Wiens

Received: 6 July 2011 / Accepted: February 2012 / Published online: February 2012
© Springer Science+Business Media, LLC 2012

Abstract  We study designs, optimal up to and including terms that are $O(n^{-1})$, for weighted least squares regression, when the weights are intended to be inversely proportional to the variances but are estimated with random error. We take a finite, but arbitrarily large, design space from which the support points are to be chosen, and obtain the optimal proportions of observations to be assigned to each point. Specific examples of D- and I-optimal design for polynomial responses are studied. In some cases the same designs that are optimal under homoscedasticity remain so for a range of variance functions; in others there tend to be more support points than are required in the homoscedastic case. We also exhibit minimax designs, that minimize the maximum, over finite classes of variance functions, value of the loss. These also tend to have more support points, often resulting from the breaking down of replicates into clusters.

Keywords  Linear programming · Minimax · Optimal design · Polynomial regression

D.P. Wiens, Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Alberta, Canada. doug.wiens@ualberta.ca

1 Introduction and summary

Suppose that an experimenter makes $n$ independent observations on random variables (r.v.'s) $Y$ at several locations $x_1, \ldots, x_N$. Those made at location $x_i$ are denoted $\{Y_{ij}\}_{j=1}^{n_i}$, so that $n = \sum_{i=1}^N n_i$, and have means $x_i'\beta$ and variances $\sigma_i^2 = \sigma^2(x_i)$. We specify only that $\sum_i n_i = n$; there is no requirement that observations be made at all locations, and typically many of the $n_i$ will be zero.

Consider the weighted least squares estimate of $\beta$, with weights $w_i > 0$ on $x_i$ and on $Y_{ij}$. This estimate is, assuming the existence of the inverse,
$$ \hat\beta = (X'W\Xi X)^{-1} X'W\Xi\,\bar y, $$
where $X_{N\times p} = (x_1, \ldots, x_N)'$, $W_{N\times N} = \mathrm{diag}(w_1, \ldots, w_N)$, $\Xi_{N\times N} = \mathrm{diag}(\xi_1, \ldots, \xi_N)$, $\xi_i = n_i/n$, and $\bar y$ is the $N\times 1$ vector of averages with elements $\bar y_i = \sum_{j=1}^{n_i} Y_{ij}/n_i$ (for definiteness, define $\bar y_i = 0$ when $n_i = 0$). The $p\times p$ covariance matrix of the normalized estimates $\sqrt n\,\hat\beta$ is, with $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$, given by
$$ C = (X'W\Xi X)^{-1}\, X'W\Sigma\Xi WX\, (X'W\Xi X)^{-1}. \qquad (1) $$

The weights are typically computed from estimated variances and then $C$ is random; we measure the loss by the expected value $E[\mathcal L(C)]$ of some scalar-valued function $\mathcal L$ of this matrix. Possibilities upon which we shall concentrate are $\mathcal L_D(C) = \log\det C$, the logarithm of the generalized variance, and $\mathcal L_I(C) = \mathrm{tr}(XCX')$, the sum, over the design space, of the prediction variances. The problem addressed here is to find a design, i.e. a choice of $\Xi$, so as to minimize the loss in the face of uncertainty about the variances and errors in their estimation. We will assume that the target weights are the inverses of the variances, since these result in regression estimates with maximum efficiency, and that the variances will be estimated, once the data are gathered, through the $n_i$ observations made at $x_i$.
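The estimate and the matrix $C$ of (1) translate directly into code. The following is a minimal Python sketch (a stand-in for the author's MATLAB routines, which are not reproduced here); the function name and argument layout are our own.

```python
import numpy as np

def wls_estimate_and_C(X, w, xi, sigma2, ybar):
    """WLS estimate and the normalized covariance matrix C of Eq. (1).

    X      : (N, p) regressor matrix with rows x_i'
    w      : (N,) positive weights w_i
    xi     : (N,) design proportions xi_i = n_i / n (zeros allowed)
    sigma2 : (N,) true variances sigma_i^2
    ybar   : (N,) vector of averages ybar_i
    """
    W, Xi, Sigma = np.diag(w), np.diag(xi), np.diag(sigma2)
    M = X.T @ W @ Xi @ X                              # X' W Xi X, assumed invertible
    Minv = np.linalg.inv(M)
    beta_hat = Minv @ X.T @ W @ Xi @ ybar             # the WLS estimate
    C = Minv @ X.T @ W @ Sigma @ Xi @ W @ X @ Minv    # Eq. (1)
    return beta_hat, C
```

With the target weights $w_i = 1/\sigma_i^2$, the returned $C$ reduces to $(X'\Sigma^{-1}\Xi X)^{-1}$.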

We entertain a scenario under which the target values are missed because of multiplicative, positive random error:
$$ w_i = \frac1{\hat\sigma_i^2} = \frac{Z_i}{\sigma_i^2}, $$
for positive r.v.'s $Z_i$. This results in the probability model
$$ \log w_i = -\log\sigma_i^2 + \eta_i. \qquad (2) $$
The r.v.'s $\eta_i = \log Z_i$ are assumed to be independent, with zero means and variances $\sigma_\eta^2/n_i$.

The problem of weighting heteroscedastic data in a regression is commonly treated strictly as an estimation problem. Fuller and Rao (1978), Carroll and Ruppert (1982), Shao (1993) and Hooper (1993) made a variety of proposals for this, all with the aim of finding weights inversely proportional to the variances. Treating this as a design problem, Dette et al. (2005) obtained optimal designs, minimax for certain classes of variance functions assumed to be estimated without error. Wiens (1998) and Fang and Wiens (2000) proposed both weights and designs with minimax properties. The maxima were evaluated over classes of departures from both homoscedasticity and from the fitted regression response. In each case the minimax weights depended in a rather involved manner on the design weights and on the least favourable variances. Wiens (2000) obtained designs and weights for homoscedastic regression models, with possibly misspecified response functions. These were required to minimize a function of the covariance matrix of the regression estimates, under a side condition of unbiasedness. The resulting weights could roughly be described as being inversely proportional to the norms of the vectors of regressors.

None of the papers detailed above explicitly addresses the errors caused by the estimation of the variances; it is the purpose of the current article to fill this gap in the literature. We envision that the experimenter will choose a design, gather the data, and then estimate the variances and carry out a final, weighted least squares regression. Our model (2) allows for the estimation of the variances to be done in a variety of ways. We first obtain designs optimal when the $\{\sigma_i^2\}$ are correctly specified. This restriction is tempered by the fact that the optimal designs seem to vary slowly with the variance functions. Nonetheless, we present as well examples in which the designs are (i) simultaneously optimal over a continuum of variance functions (Example 3), and (ii) minimax optimal, over a finite class of variance functions (Example 6).

In the next section we give an expansion of $E[\mathcal L(C)]$ in powers of $n^{-1/2}$, and show that the $O(n^{-1/2})$ term vanishes. This derivation, and some others that are long or less central to the development, are in the Appendix. We then approximate the loss by the sum of the constant and $O(n^{-1})$ terms, and seek to choose $\{\xi_i\}$ in order to minimize this approximate loss. At this point the requirement that $\xi_i$ be an integral multiple of $n^{-1}$ is dropped, thus yielding continuous designs that must be approximated, in a manner described in Sect. 4, by implementable exact designs. In Sect. 3 we present some analytic characterizations of optimal designs. We present a necessary and sufficient first order condition for a design to furnish a local minimum of the loss, and go on to investigate global optimality. The theory is illustrated through some simple examples; the conditions are however sufficiently complex that our more comprehensive designs are studied numerically. Section 4 contains examples of optimal designs in the case of polynomial regression, illustrating how these change with $\sigma_\eta^2/n$ and with the structure of $\{\sigma_i^2\}$.
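Model (2) is easy to simulate, which gives a brute-force check on the loss expansion developed in the next section. A hedged sketch, reusing `wls_estimate_and_C` from above: the $\eta_i$ are taken Gaussian (one convenient case; the model requires only zero means and variances $\sigma_\eta^2/n_i$), the integer allocation is deliberately crude, and `ybar` is irrelevant for $C$ so zeros are passed.

```python
import numpy as np

rng = np.random.default_rng(1)

def mc_loss_D(X, xi, sigma2, n, sigma_eta2, reps=2000):
    """Monte Carlo estimate of E[log det C] under the weight model (2),
    with Gaussian eta_i.  The crude allocation n_i = max(round(n*xi_i), 1)
    is used only to set the error variances sigma_eta^2 / n_i."""
    N = X.shape[0]
    n_i = np.maximum(np.round(n * np.asarray(xi)).astype(int), 1)
    vals = []
    for _ in range(reps):
        eta = rng.normal(0.0, np.sqrt(sigma_eta2 / n_i))
        w = np.exp(-np.log(sigma2) + eta)            # w_i = Z_i / sigma_i^2
        _, C = wls_estimate_and_C(X, w, xi, sigma2, np.zeros(N))
        vals.append(np.linalg.slogdet(C)[1])
    return np.mean(vals)
```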
A message to be gleaned from the theory and examples presented here is that, relative to designs for homoscedastic regression models, those for estimated heteroscedasticity should have more design points, especially in regions of small anticipated variability, and that some of the replicates should be replaced by clusters of observations in nearby but distinct locations. This last point has often also been made in studies of designs which are to be robust against misspecified responses, and so the designs presented here can be expected to inherit some robustness in the latter situations.

2 Expansions of the loss functions

We make the following assumptions:

(A1) The function $\mathcal L$ possesses two continuous derivatives with respect to each element of its argument.
(A2) The expectations $E[w_i^t]$ exist in an open neighbourhood of $t = 0$.

Assumption (A2), together with $w_i = e^{\eta_i}/\sigma_i^2$, ensures that all $\eta_i$ have moment generating functions, hence all moments are finite. We will write $R_n$ (resp., $r_n$) generically, to denote a matrix (resp., a scalar) that is $o_p(n^{-1})$ and for which the expectation is $o(n^{-1})$. We seek an expansion of the form
$$ C = C_0 + \frac1{\sqrt n}\,C_1 + \frac1n\,C_2 + R_n, \qquad (3) $$
with $C_0$ being non-random, and $E[C_1] = 0$. It will in fact turn out that $C_1 = 0$. The assumptions on $\mathcal L$ then allow the expansion
$$ \mathcal L(C) = \mathcal L(C_0) + \mathrm{tr}\Big[\mathcal L'(C_0)\Big(\frac1{\sqrt n}C_1 + \frac1n C_2\Big)\Big] + \frac1{2n}\,\mathrm{vec}'(C_1)\,\mathcal L''(C_0)\,\mathrm{vec}(C_1) + r_n, $$
with
$$ E[\mathcal L(C)] = \mathcal L(C_0) + \frac1n\Big\{\mathrm{tr}\big(\mathcal L'(C_0)E[C_2]\big) + \tfrac12\,\mathrm{tr}\big(\mathcal L''(C_0)\,\mathrm{COV}[\mathrm{vec}(C_1)]\big)\Big\} + r_n. $$

Here vec refers to the concatenation of the columns of a matrix, $\mathcal L'(C)$ is the $p\times p$ matrix with $(i,j)$th element $\partial\mathcal L/\partial c_{ij}$, and $\mathcal L''(C)$ is the $p^2\times p^2$ Hessian matrix. We then approximate $E[\mathcal L(C)]$ by
$$ L(\xi) = \mathcal L(C_0) + \frac1n\Big\{\mathrm{tr}\big(\mathcal L'(C_0)E[C_2]\big) + \tfrac12\,\mathrm{tr}\big(\mathcal L''(C_0)\,\mathrm{COV}[\mathrm{vec}(C_1)]\big)\Big\}, \qquad (4) $$
where $\xi = (\xi_1, \ldots, \xi_N)'$. The following expansion of $L(\xi)$ is derived in the Appendix.

Theorem 1  With notation as above, and with $U = \Sigma^{-1/2}X = (u_1, \ldots, u_N)'$ and $\tau = \sigma_\eta^2/n$, we have that $C_0 = G^{-1}(\xi)$ for $G(\xi) = U'\Xi U$, and that the loss (4) is
$$ L(\xi) = \mathcal L\big(G^{-1}(\xi)\big) + \tau\sum_{i=1}^N\big(1 - \xi_i\,u_i'G^{-1}(\xi)u_i\big)\,u_i'G^{-1}(\xi)\,\mathcal L'\big(G^{-1}(\xi)\big)\,G^{-1}(\xi)\,u_i. \qquad (5) $$
The particular cases noted above are $\mathcal L_D(C) = \log\det C$, with $\mathcal L_D'(C) = C^{-1}$ and
$$ L_D(\xi) = \log\det G^{-1}(\xi) + \tau\sum_{i=1}^N\big(1 - \xi_i\,u_i'G^{-1}(\xi)u_i\big)\,u_i'G^{-1}(\xi)u_i; \qquad (6) $$
and $\mathcal L_I(C) = \mathrm{tr}(XCX')$, with $\mathcal L_I'(C) = X'X$ and
$$ L_I(\xi) = \mathrm{tr}\big(X'XG^{-1}(\xi)\big) + \tau\sum_{i=1}^N\big(1 - \xi_i\,u_i'G^{-1}(\xi)u_i\big)\,u_i'G^{-1}(\xi)X'XG^{-1}(\xi)u_i. \qquad (7) $$

Remarks
1. The parameter $\tau$ introduced in Theorem 1 can be chosen by the designer without a knowledge of $\sigma_\eta^2$, to express the faith that he is willing to place in the accuracy of his variance estimates. If $\tau = 0$ then he designs in anticipation of estimating these variances, and hence the efficient weights, perfectly. In this case only the leading term in (5) is to be minimized by the design. Larger values of $\tau$ correspond to a more conservative approach and a reliance on the second term in the expansion (5).
2. A reviewer has asked about the possible impact of altering the I-criterion by replacing the sum of the prediction variances by their integral over a continuous version of the design space (the convex hull of the set $\{x_i\}_{i=1}^N$ would be one possibility). Denoting this continuous extension by $S$, we would have $\int_S x'Cx\,dx = \mathrm{tr}[CA]$, with $A = \int_S xx'\,dx$ replacing $X'X = \sum_{i=1}^N x_ix_i'$ in $\mathcal L_I(C)$ and $L_I(\xi)$, and little anticipated effect on the resulting designs.

3 Design optimality

We say that a design is locally optimal if it furnishes a local minimum of the loss $L(\xi)$, and globally optimal if it furnishes a global minimum. The conditions for local optimality are thus weaker than those for global optimality, but are correspondingly much easier to verify. In some cases the locally optimal designs found in this way also turn out to be globally optimal. In other cases we have found locally optimal designs but have been unable to answer the question of global optimality. The various Equivalence Theorems in the literature offer little guidance here, since they apply at most only to the leading term of our expansion (4). In this section we study these notions, restricting to the cases of D- and I-optimality.

3.1 Local optimality

The following theorem gives an easily checked condition to verify that a proposed minimizer furnishes at least a local minimum of the loss. First consider $L_D(\xi)$ at (6). We show in the Appendix that the gradient $\nabla L_D(\xi) \stackrel{\text{def}}{=} c_\xi$ is the $N\times1$ vector with elements
$$ c_{\xi,i} = -q_{\xi,i} - \tau\big\{q_{\xi,i}^2 + (Q_\xi^2)_{ii} - 2(Q_\xi\Xi D_qQ_\xi)_{ii}\big\}, \quad i = 1, \ldots, N, \qquad (8) $$
where $Q_\xi = UG^{-1}(\xi)U' = U(U'\Xi U)^{-1}U'$, and $D_q = \mathrm{diag}(q_{\xi,1}, \ldots, q_{\xi,N})$ with $q_{\xi,i} = u_i'G^{-1}(\xi)u_i$. We write $Q_\xi$ and $D_q$ (and later $D_r$) to denote evaluation at $\xi$.
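The loss (6) and gradient (8), as reconstructed here, are short to code. A minimal Python sketch (the function name is ours; `xi` and `U` are numpy arrays; diagonal-matrix products are written as broadcasts):

```python
import numpy as np

def loss_D_and_gradient(xi, U, tau):
    """Approximate D-loss (6) and its gradient (8), as reconstructed above.

    xi : (N,) design proportions; U : (N, p) with rows u_i'; tau = sigma_eta^2 / n.
    """
    G = U.T @ (xi[:, None] * U)                  # G(xi) = U' Xi U
    Ginv = np.linalg.inv(G)
    Q = U @ Ginv @ U.T                           # Q_xi
    q = np.diag(Q)                               # q_{xi,i} = u_i' G^{-1} u_i
    L = -np.linalg.slogdet(G)[1] + tau * np.sum((1.0 - xi * q) * q)
    c = -q - tau * (q**2 + (Q**2).sum(axis=1) - 2.0 * (Q**2) @ (xi * q))
    return L, c
```

A finite-difference check of `c` against `L` is a useful safeguard on the reconstruction of (8).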

Theorem 2  In order that $\xi$ furnish a local minimum of $L_D$, it is necessary and sufficient to have
$$ c_{\xi,i} \ge k_\xi \stackrel{\text{def}}{=} c_\xi'\xi, \quad i = 1, \ldots, N, $$
with equality on the support $S = \{i \mid \xi_i > 0\}$. The value of the constant is
$$ k_\xi = -p - \tau\,\mathrm{tr}\big[Q_\xi(I_N - \Xi D_q)\big] = -p - \tau\sum_{j=1}^N q_{\xi,j}\big(1 - \xi_j q_{\xi,j}\big). \qquad (9) $$

Proof  Suppose first that $\xi$ furnishes a local minimum. Then it is necessary that the directional derivative, in the direction of any distribution $\xi_1$, be non-negative:
$$ \frac{d}{dt}L_D\big((1 - t)\xi + t\xi_1\big)\Big|_{t=0^+} = c_\xi'(\xi_1 - \xi) = \sum_{i=1}^N(c_{\xi,i} - k_\xi)\,\xi_{1,i} \ge 0. \qquad (10) $$
Since $\xi_1$ is arbitrary, (10) requires $c_{\xi,i} \ge k_\xi$ for $i = 1, \ldots, N$. Then putting $\xi_1 = \xi$ in (10), so that $\sum_i(c_{\xi,i} - k_\xi)\xi_i = 0$, yields that $c_{\xi,i} = k_\xi$ on $S$. That $k_\xi$ is given by (9) is a straightforward calculation. Conversely, if the conditions of the theorem hold, then the inequality in (10) is immediate, and so $\xi$ furnishes a local minimum.

For I-optimality we require further definitions. Let $R_\xi = UG^{-1}(\xi)X'XG^{-1}(\xi)U'$, with diagonal elements $r_{\xi,i} = u_i'G^{-1}(\xi)X'XG^{-1}(\xi)u_i$. Set $D_r = \mathrm{diag}(r_{\xi,1}, \ldots, r_{\xi,N})$. The gradient $\nabla L_I(\xi) \stackrel{\text{def}}{=} c_\xi$ (again derived in the Appendix) is the $N\times1$ vector with elements
$$ c_{\xi,i} = -r_{\xi,i} - \tau\big\{q_{\xi,i}r_{\xi,i} + 2(R_\xi Q_\xi)_{ii} - 2(R_\xi\Xi D_qQ_\xi)_{ii} - (Q_\xi\Xi D_rQ_\xi)_{ii}\big\}, \quad i = 1, \ldots, N. \qquad (11) $$
The proof of the following theorem is identical to that of Theorem 2, although the final expressions are less amenable to simplification.

Theorem 3  In order that $\xi$ furnish a local minimum of $L_I$, it is necessary and sufficient to have $c_{\xi,i} \ge k_\xi \stackrel{\text{def}}{=} c_\xi'\xi$, $i = 1, \ldots, N$, with equality on the support $S = \{i \mid \xi_i > 0\}$. The value of the constant is
$$ k_\xi = -\mathrm{tr}\big(X'XG^{-1}(\xi)\big) - \tau\Big[\sum_{j=1}^N\xi_j q_{\xi,j}r_{\xi,j} + 2\,\mathrm{tr}(R_\xi) - 2\,\mathrm{tr}\big(R_\xi\Xi D_qQ_\xi\Xi\big) - \mathrm{tr}\big(Q_\xi\Xi D_rQ_\xi\Xi\big)\Big]. \qquad (12) $$

Example 1  Suppose that $p = 1$, i.e. that there is no intercept and only one independent variable. We verify that the design that places all mass where $x_i^2/\sigma_i^2$ is a maximum satisfies the conditions of Theorems 2 and 3, hence is locally both D- and I-optimal. For this put $u_i = x_i/\sigma_i$ and $s_i = u_i^2$, and order these as $s_1 \le \cdots \le s_N$. The support of $\xi$ is $S = \{i \mid s_i = s_N\}$. We calculate from (8) and (9) that
$$ c_{\xi,i} = -\frac{s_i}{s_N} - \tau\,\frac{s_i}{s_N}\Big(\frac{s_i}{s_N} + \frac{\sum_{j=1}^N s_j}{s_N} - 2\Big), \qquad k_\xi = -1 - \tau\Big(\frac{\sum_{j=1}^N s_j}{s_N} - 1\Big), $$
and then
$$ c_{\xi,i} - k_\xi = \frac{s_N - s_i}{s_N}\Big[1 + \tau\,\frac{s_i - s_N + \sum_{j=1}^N s_j}{s_N}\Big] \ge 0, $$
with equality on $S$, so that Theorem 2 applies. Similarly, from (11) and (12),
$$ c_{\xi,i} = -\frac{\|x\|^2 s_i}{s_N^2}\Big[1 + \tau\,\frac{s_i + 2\sum_j s_j - 3s_N}{s_N}\Big], \qquad k_\xi = -\frac{\|x\|^2}{s_N}\Big[1 + \tau\,\frac{s_N + 2\sum_j s_j - 3s_N}{s_N}\Big], $$
and then
$$ c_{\xi,i} - k_\xi = \frac{\|x\|^2(s_N - s_i)}{s_N^2}\Big[1 + \tau\,\frac{s_i - 2s_N + 2\sum_j s_j}{s_N}\Big] \ge 0, $$
with equality on $S$. We revisit this example in the next section, and show that this design is in fact globally D- and I-optimal.

Example 2  To illustrate the satisfaction of the conditions of Theorem 2, we use the methods described in Sect. 4 below to obtain a locally D-optimal design for quartic regression. The independent variable $x$ takes on values uniformly spaced over $[-1, 1]$:
$$ x_i = -1 + \frac{2(i-1)}{N-1}, \quad i = 1, \ldots, N, \qquad (13) $$
with $N = 21$. The variance function is
$$ \sigma_i^2 \propto (1 + x_i^2)^d, \qquad (14) $$
with $d = 1$. The MATLAB code is available from us. We take $\tau = 1$ and obtain the design given in the left plot of Fig. 1. In the right plot we superimpose the scaled values of $c_{\xi,i} - k_\xi$, illustrating the satisfaction of the conditions of Theorem 2.

Fig. 1  D-optimal design $\xi_i$ plotted against $x_i$ for quartic regression, together with values of $c_{\xi,i} - k_\xi$, as described in Example 2
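The minimization behind Example 2 can be mimicked with an off-the-shelf constrained optimizer; the paper used MATLAB's routines, and the following scipy sketch is our stand-in. It reuses `loss_D_and_gradient` from above; $N = 21$ and $d = \tau = 1$ are our reading of the garbled source.

```python
import numpy as np
from scipy.optimize import minimize

N, d, tau, p = 21, 1.0, 1.0, 5                 # quartic response: p = 5
x = -1 + 2 * np.arange(N) / (N - 1)            # grid (13)
sigma2 = (1 + x**2) ** d                       # variances (14)
sigma2 = sigma2 / sigma2.sum()                 # normalization as in Sect. 4
X = np.vander(x, p, increasing=True)           # columns 1, x, ..., x^4
U = X / np.sqrt(sigma2)[:, None]

res = minimize(lambda xi: loss_D_and_gradient(xi, U, tau)[0],
               np.full(N, 1.0 / N), bounds=[(0, 1)] * N,
               constraints=({'type': 'eq', 'fun': lambda xi: xi.sum() - 1},),
               method='SLSQP')

# Theorem 2 check: c_i - k >= 0, with (near-)equality on the support.
_, c = loss_D_and_gradient(res.x, U, tau)
print(np.round(c - c @ res.x, 4))
```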

Example 3  Suppose that one anticipates fitting a straight line response with an intercept, with possible design points $-1 = x_1 < x_2 < \cdots < x_{N-1} < x_N = 1$. Assume that these and the variances are symmetric: $x_{N-i+1} = -x_i$ and $\sigma_{N-i+1}^2 = \sigma_i^2$. With constant variances, the D- and I-optimal design places half of its mass at each of $x = \pm1$. Under what conditions on $\{\sigma_i^2\}$ does this design continue to satisfy the conditions of Theorems 2 or 3 for local D- or I-optimality? For this design $\xi = \frac12(\delta_{x_1} + \delta_{x_N})$ we have, with $\nu_i = \sigma_i^2/\sigma_N^2$, that $q_{\xi,i} = (1 + x_i^2)/\nu_i$ and
$$ c_{\xi,i} - k_\xi = (2 - q_{\xi,i}) + \tau M_i, \quad\text{for } M_i = \sum_{j=1}^N\Big[q_{\xi,j} - \frac{(1 + x_ix_j)^2}{\nu_i\nu_j}\Big] - (2 - q_{\xi,i})^2. $$
Note that $c_{\xi,i} - k_\xi = 0$ for $i = 1, N$; thus if
$$ q_{\xi,i} \le 2, \quad\text{i.e.}\quad \nu_i \ge \frac{1 + x_i^2}{2}, \quad i = 2, \ldots, N-1, \qquad (15) $$
then the conditions of Theorem 2 hold for $\tau \le \tau_D$, where
$$ \tau_D = \begin{cases}\min_{\{i \mid M_i < 0\}}\dfrac{2 - q_{\xi,i}}{-M_i}, & \text{if } \{i \mid M_i < 0\} \neq \emptyset,\\ \infty, & \text{otherwise.}\end{cases} $$
We note that the variance functions (14) satisfy (15) for $d \le 1$. Similarly,
$$ c_{\xi,i} - k_\xi = \sigma_N^2\Big[N\Big(1 - \frac1{\nu_i}\Big) + \|x\|^2\Big(1 - \frac{x_i^2}{\nu_i}\Big)\Big] + \tau\tilde M_i, $$
for
$$ \tilde M_i = \frac{\sigma_N^2}{\nu_i}\Big\{4\big(N + \|x\|^2x_i^2\big) + \big(N + \|x\|^2\big)\big(1 + x_i^2\big) - \frac{(1 + x_i^2)(N + \|x\|^2x_i^2)}{\nu_i} - 2\sum_{j=1}^N\frac{(1 + x_ix_j)(N + \|x\|^2x_ix_j)}{\nu_j}\Big\} + 2\sigma_N^2\sum_{j=1}^N\frac{N + \|x\|^2x_j^2}{\nu_j} - 4\sigma_N^2\big(N + \|x\|^2\big); $$
here $x = (x_1, \ldots, x_N)'$ and $\|x\|^2 = \sum_j x_j^2$. Thus if
$$ N\Big(1 - \frac1{\nu_i}\Big) + \|x\|^2\Big(1 - \frac{x_i^2}{\nu_i}\Big) \ge 0, \quad i = 2, \ldots, N-1, $$
then the conditions of Theorem 3 hold for $\tau \le \tau_I$, where
$$ \tau_I = \begin{cases}\min_{\{i \mid \tilde M_i < 0\}}\dfrac{\sigma_N^2\big[N(1 - 1/\nu_i) + \|x\|^2(1 - x_i^2/\nu_i)\big]}{-\tilde M_i}, & \text{if } \{i \mid \tilde M_i < 0\} \neq \emptyset,\\ \infty, & \text{otherwise.}\end{cases} $$
See Fig. 2 for plots of $\tau_D$ and $\tau_I$ vs. $d$, when the design space and variance functions are as at (13), with $N = 21$, and (14) respectively.
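The thresholds of Example 3 can also be obtained without the closed forms for $M_i$: since $c_{\xi,i} - k_\xi$ is affine in $\tau$, two gradient evaluations suffice. A sketch, reusing `loss_D_and_gradient` and our reconstructed formulas (the function name and tolerances are ours):

```python
import numpy as np

def tau_D_threshold(x, sigma2):
    """Largest tau for which the two-point design of Example 3 (mass 1/2 at
    x_1 and x_N) satisfies the Theorem 2 condition; uses the fact that
    c_{xi,i} - k_xi is affine in tau."""
    N = len(x)
    U = np.column_stack([np.ones(N), x]) / np.sqrt(sigma2)[:, None]
    xi = np.zeros(N); xi[0] = xi[-1] = 0.5
    gap = lambda tau: (lambda c: c - c @ xi)(loss_D_and_gradient(xi, U, tau)[1])
    A, B = gap(0.0), gap(1.0) - gap(0.0)           # gap(tau) = A + tau * B
    if np.any(A < -1e-12):
        return np.nan                              # condition fails even at tau = 0
    neg = B < -1e-12
    return np.inf if not neg.any() else float(np.min(A[neg] / -B[neg]))

# Sweep d to trace the tau_D curve of Fig. 2:
x = -1 + 2 * np.arange(21) / 20
for d in (0.5, 1.0, 1.5):
    print(d, tau_D_threshold(x, (1 + x**2) ** d))
```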

Fig. 2  Values of $\tau_D$ (left; $= \infty$ for $d < 0.43$) and $\tau_I$ (right; $= \infty$ for $d < 0.156$) versus $d$, for variances given by (14) and $N = 21$. The two-point design $0.5\delta_{-1} + 0.5\delta_{+1}$ is D-optimal for $\tau \in [0, \tau_D]$ and I-optimal for $\tau \in [0, \tau_I]$. When $d > 1.2$ (resp., $d > 0.45$) the interval $[0, \tau_D]$ (resp., $[0, \tau_I]$) is empty

3.2 Global optimality

Recall the definitions of $G(\xi)$ and $u_i$ from Theorem 1. We can view a design $\xi$ as a probability distribution, and then if $P_\xi(u = u_i) = \xi_i$ we have that
$$ G(\xi) = \sum_{i=1}^N\xi_i u_iu_i' = E_\xi[uu']. \qquad (16) $$
Thus, define $\Xi$ to be the set of all $N$-point distributions $\xi$, and define $\mathcal G$ to be the set of positive definite matrices that arise as in (16):
$$ \mathcal G = \big\{G \mid G > 0 \text{ and } G = E_\xi[uu'] \text{ for some } \xi \in \Xi\big\}. $$
Note that $\mathcal G$ is a convex subset of the set of symmetric $p\times p$ matrices. For $G \in \mathcal G$ let $\xi_G$ be any distribution for which $G = E_{\xi_G}[uu']$, and define $\Xi_G = \{\xi_G\}$, a convex subset of $\Xi$. It is easily verified that, for any $\xi$,
$$ L_D(\xi) \ge \inf_{G\in\mathcal G}\ \inf_{\xi\in\Xi_G} L_D(\xi), $$
and so a path to the solution is to minimize in stages: first over $\Xi_G$ for fixed $G$, then over $\mathcal G$. From (6) it is seen that the first stage in this process is equivalent to maximizing $E_\xi[(u'G^{-1}u)^2]$; this is continued in the following theorem.

Theorem 4  With notation as above, set $\nu_D(G) = \max\{E_\xi[(u'G^{-1}u)^2] \mid \xi \in \Xi_G\}$, and $c_i = (u_i'G^{-1}u_i)^2$. Then
$$ \nu_D(G) = \min_M\max_{i=1,\ldots,N}\big\{c_i + \mathrm{tr}\big[M\big(u_iu_i' - G\big)\big]\big\}, \qquad (17) $$
where the minimum is over symmetric $p\times p$ matrices $M$. Let
$$ G_D = \arg\min_{G\in\mathcal G}\big\{\log\det G^{-1} + \tau\,\mathrm{tr}\big[G^{-1}U'U\big] - \tau\nu_D(G)\big\}, \qquad (18) $$
assuming that the minimum is attained. Then the minimum value of $L_D(\xi)$ is
$$ L_D(\xi_D) = \log\det G_D^{-1} + \tau\,\mathrm{tr}\big[G_D^{-1}U'U\big] - \tau\nu_D(G_D), \qquad (19) $$
and the minimizing $\xi_D$ is any design maximizing $E_\xi[(u'G_D^{-1}u)^2]$ within the class of designs satisfying $E_\xi[uu'] = G_D$.
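The stage-one maximization in Theorem 4 is a linear program (written out explicitly as (20) and (21) in the proof below), so it can be handed to an off-the-shelf solver. A sketch with scipy (our own function name; any consistent ordering of the upper-triangle elements serves for $\mathrm{vec}_S$):

```python
import numpy as np
from scipy.optimize import linprog

def stage_one_lp(U, G):
    """Among designs xi with E_xi[uu'] = G, maximize E_xi[(u' G^{-1} u)^2];
    the LP (20)-(21) of the proof of Theorem 4."""
    N, p = U.shape
    iu = np.triu_indices(p)
    vec_S = lambda A: A[iu]                        # upper triangle of a symmetric matrix
    W = np.column_stack([vec_S(np.outer(u, u)) for u in U])
    V = np.vstack([W, np.ones(N)])                 # (21b)
    dvec = np.append(vec_S(np.asarray(G)), 1.0)    # (21c)
    Ginv = np.linalg.inv(G)
    cvec = np.einsum('ij,jk,ik->i', U, Ginv, U)**2 # (21d): c_i = (u_i' G^{-1} u_i)^2
    res = linprog(-cvec, A_eq=V, b_eq=dvec, bounds=[(0, None)] * N)
    return res.x, -res.fun                         # xi and nu_D(G)
```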

Proof  For fixed $G(\xi) = G$, we have
$$ L_D(\xi) = \log\det G^{-1} + \tau\,\mathrm{tr}\big[G^{-1}U'U\big] - \tau E_\xi\big[(u'G^{-1}u)^2\big], $$
and we first seek to minimize this over $\xi \in \Xi_G$. This is a linear programming problem. Let $\mathrm{vec}_S(G)$ denote the $p(p+1)/2 \times 1$ vector consisting of those elements $g_{jk}$ that appear in $\mathrm{vec}(G)$, but with $j \le k$, i.e. the columnwise representation of the upper triangle of the symmetric matrix $G$. Expressing the constraints $G = E_\xi[uu']$, $\sum_i\xi_i = 1$, $\xi_i \ge 0$ in linear programming notation, the linear program becomes
$$ \text{maximize } c'\xi \quad\text{subject to } V\xi = d \text{ and } \xi \ge 0_{N\times1}, \qquad (20) $$
where
$$ p_0 = \frac{p(p+1)}2 + 1, \qquad (21a) $$
$$ V_{p_0\times N} = \begin{pmatrix} W\\ 1_N'\end{pmatrix}\ \text{ for } W_{\frac{p(p+1)}2\times N} = \big(\mathrm{vec}_S(u_1u_1')\ \cdots\ \mathrm{vec}_S(u_Nu_N')\big), \qquad (21b) $$
$$ d_{p_0\times1} = \big(\mathrm{vec}_S'(G),\ 1\big)', \qquad (21c) $$
$$ c_{N\times1} = (c_1, \ldots, c_N)'. \qquad (21d) $$
The set of feasible solutions is non-empty, by the definition of $\mathcal G$, and then standard linear programming theory ensures that the maximum is attained at an extreme point of the convex set generated by the set of feasible solutions. Since this set is bounded ($\xi'1_N = 1$ for $\xi \ge 0$) the maximum is finite. The dual problem is to find $\mu_{p_0\times1}$ to minimize $d'\mu$ subject to $V'\mu \ge c$. Equivalently,
$$ \text{minimize } \mu_{p_0} \text{ subject to } \mu_{p_0} + \big(\mu_1, \ldots, \mu_{p(p+1)/2}\big)\,\mathrm{vec}_S\big(u_iu_i' - G\big) \ge c_i \text{ for } i = 1, \ldots, N. \qquad (22) $$
By the Duality Theorem (Gass 1975, p. 119) the solutions to these problems have the same extrema:
$$ \nu_D(G) = \mu_{p_0}^*. \qquad (23) $$
To analyze this, first relabel $\mu_1, \ldots, \mu_{p(p+1)/2}$ as $\mu_{11}; \mu_{12}, \mu_{22}; \ldots; \mu_{1p}, \mu_{2p}, \ldots, \mu_{pp}$, and define a symmetric matrix $M$ by
$$ m_{jk} = \begin{cases}\mu_{jk}/2, & j < k,\\ \mu_{jj}, & j = k,\\ \mu_{kj}/2, & j > k.\end{cases} $$
Then
$$ \big(\mu_1, \ldots, \mu_{p(p+1)/2}\big)\,\mathrm{vec}_S\big(u_iu_i' - G\big) = \sum_{j,k=1}^p m_{jk}\big(u_iu_i' - G\big)_{jk} = \mathrm{tr}\big[M\big(u_iu_i' - G\big)\big], $$
and (22) becomes
$$ \text{minimize } \mu_{p_0} \text{ subject to } \mu_{p_0} \ge \max_{i=1,\ldots,N}\big\{c_i + \mathrm{tr}\big[M\big(u_iu_i' - G\big)\big]\big\}. $$
Thus $\mu_{p_0}^*$ is the minimum, over symmetric matrices $M$, of this maximum; using (23) as well we obtain (17). Now
$$ \min_{\xi\in\Xi_G}L_D(\xi) = \log\det G^{-1} + \tau\,\mathrm{tr}\big[G^{-1}U'U\big] - \tau\nu_D(G), $$
and so if the minimum in (18) is attained we immediately have (19).

For $L_I(\xi)$ at (7) we obtain the following, whose proof is essentially identical to that of Theorem 4 and so is omitted.

Theorem 5  With notation as above, set $\nu_I(G) = \max\{E_\xi[(u'G^{-1}u)(u'G^{-1}X'XG^{-1}u)] \mid \xi \in \Xi_G\}$, and $c_i = (u_i'G^{-1}u_i)(u_i'G^{-1}X'XG^{-1}u_i)$. Then
$$ \nu_I(G) = \min_M\max_{i=1,\ldots,N}\big\{c_i + \mathrm{tr}\big[M\big(u_iu_i' - G\big)\big]\big\}, $$
where the minimum is over symmetric $p\times p$ matrices $M$. Let
$$ G_I = \arg\min_{G\in\mathcal G}\big\{\mathrm{tr}\big(X'XG^{-1}\big) + \tau\,\mathrm{tr}\big[XG^{-1}U'UG^{-1}X'\big] - \tau\nu_I(G)\big\}, $$
assuming that the minimum is attained. Then the minimum value of $L_I(\xi)$ is
$$ L_I(\xi_I) = \mathrm{tr}\big(XG_I^{-1}X'\big) + \tau\,\mathrm{tr}\big[XG_I^{-1}U'UG_I^{-1}X'\big] - \tau\nu_I(G_I), $$
and the minimizing $\xi_I$ is any design maximizing $E_\xi[(u'G_I^{-1}u)(u'G_I^{-1}X'XG_I^{-1}u)]$ within the class of designs satisfying $E_\xi[uu'] = G_I$.

Remark  In some cases (the continuation of Example 1 below, for instance) the condition $E_\xi[uu'] = G_D$ determines the optimizing $\xi_D$ uniquely. If there is more than one design satisfying this condition, then the required optimizer is the solution to the linear program described by (20) and (21). For I-optimality, $c_i$ in (21d) is replaced by $(u_i'G^{-1}u_i)(u_i'G^{-1}X'XG^{-1}u_i)$.

Example 1 (continued)  Assume that $s_1 > 0$, so that $\mathcal G = \{g \mid g \in [s_1, s_N]\}$. This assumption does not affect the end result, but simplifies the discussion. For $g \in \mathcal G$ we find that
$$ \nu_D(g) = \min_m\max_{i=1,\ldots,N}\Big\{\Big(\frac{s_i}g\Big)^2 + m(s_i - g)\Big\}. $$
The maximum of the quadratic in $s$ is attained at one of the extremes:
$$ \nu_D(g) = \min_m\max\Big\{\Big(\frac{s_1}g\Big)^2 + m(s_1 - g),\ \Big(\frac{s_N}g\Big)^2 + m(s_N - g)\Big\}. $$
This maximum of two linear functions of $m$ (one increasing, the other decreasing) is minimized at the point of intersection: $m = -(s_N + s_1)/g^2$, with
$$ \nu_D(g) = \frac{(s_N + s_1)g - s_Ns_1}{g^2}. $$
Then (18) becomes
$$ g_D = \arg\min_{g\in[s_1,s_N]}\Big\{-\log g + \frac\tau g\Big[\sum_{i=2}^{N-1}s_i + \frac{s_Ns_1}g\Big]\Big\}. $$
The function being minimized is a decreasing function of $g$, minimized at $g = s_N$. The only design attaining $E_\xi[s] = s_N$ is that placing all mass at $s_N$, and so this design is globally (rather than merely locally) D-optimal.
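The closed form for $\nu_D(g)$ can be validated against the LP sketch given after Theorem 4; a small check with assumed values of $s_i$:

```python
import numpy as np

# Numerical check of Example 1 (continued), p = 1: the closed form
# nu_D(g) = ((s_N + s_1) g - s_N s_1) / g^2 versus the LP (20)-(21).
s = np.array([0.3, 0.8, 1.1, 2.0])                 # assumed values s_i = u_i^2
U = np.sqrt(s)[:, None]
g = 1.4                                            # any g in [s_1, s_N]
_, nu_lp = stage_one_lp(U, np.array([[g]]))
nu_formula = ((s[-1] + s[0]) * g - s[-1] * s[0]) / g**2
print(nu_lp, nu_formula)                           # should agree
```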

For I-optimality we instead find that
$$ \nu_I(g) = \min_m\max\Big\{\frac{s_1^2}{g^3}\|x\|^2 + m(s_1 - g),\ \frac{s_N^2}{g^3}\|x\|^2 + m(s_N - g)\Big\} = \frac{\|x\|^2}g\,\nu_D(g), $$
and
$$ g_I = \arg\min_{g\in[s_1,s_N]}\Big\{\frac{\|x\|^2}g + \frac{\tau\|x\|^2}{g^2}\Big[\sum_{i=2}^{N-1}s_i + \frac{s_Ns_1}g\Big]\Big\}. $$
The minimum is again attained at $g = s_N$, and so the design placing all mass at $s_N$ is also globally I-optimal.

4 Examples: polynomial regression

We have minimized the loss functions $L_D(\xi)$ and $L_I(\xi)$, concentrating on polynomial regression with design space given by (13) with $N = 21$. Thus the $j$th column of $X_{N\times p}$ is $(x_1^{j-1}, \ldots, x_N^{j-1})'$ for $j = 1, \ldots, p$, and the degree of the polynomial to be fitted is $q = p - 1$. Although the theory of Sect. 3.2 gives insight into the structure of the solutions, it falls short of being conveniently implemented numerically. Thus the minimization of (6) and (7) was carried out directly, using nonlinear constrained minimization routines in MATLAB. The specific examples detailed here have assumed the variance functions (14) and a design of size $n = 100$. The designs are not affected by the constant of proportionality, which we choose such that $\sum_{i=1}^N\sigma_i^2 = 1$. In each case the exact, minimizing values $\{\xi_i\}$ are obtained, and then integer allocations $n_i \approx n\xi_i$ are obtained using the efficient design apportionment of Pukelsheim (1993). This is a rounding procedure with, among others, the property of sample size monotonicity: if a new point is to be allocated to an existing design, then none of the current allocations will be reduced. Although the $\{\xi_i\}$ are symmetric ($\xi_{N-i+1} = \xi_i$) the rounding sometimes results in slight asymmetries.

Example 4 covers straight line regression. Designs for cubic regression are given in Example 5. In Example 6 we present minimax designs, which minimize the maximum loss, with the maximum evaluated over a range of values of $d$ in (14).

Table 1  Designs for straight line regression, as in Example 4: support points $x_i$ together with D- and I-optimal design frequencies $n_i(D)$, $n_i(I)$, for $d = 0, 2, 4$ and $\tau = 0, 0.5, 1$

Table 2  Designs for cubic regression, as in Example 5: support points $x_i$ together with D- and I-optimal design frequencies $n_i(D)$, $n_i(I)$, for $d = 0, 2, 4$ and $\tau = 0, 0.5, 1$

Example 4  Here we take $q = 1$ (straight line regression). Designs minimizing $L_D(\xi)$ and $L_I(\xi)$ are detailed in Table 1, for the combinations of $\tau = 0, 0.5, 1$ and $d = 0, 2, 4$. For each pair $(\tau, d)$ the D- and I-optimal designs turn out to be supported on the same points. When $d = 0$ they coincide with the variance minimizing design under homoscedasticity. For $d > 0$ they differ, in manners depending on the type of optimality and on $\tau$; in particular they are typically supported on three points rather than two. Note also that for large $d$ the masses at $\pm1$ move towards the centre.
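The rounding step described at the start of this section can be sketched as follows. This is an efficient-rounding routine in the style of Pukelsheim (1993), not a verbatim transcription of the book's algorithm; the function name is ours.

```python
import numpy as np

def efficient_apportionment(xi, n):
    """Round an exact design into integer frequencies summing to n.

    xi : positive weights on the support, summing to 1;  n : sample size.
    """
    xi = np.asarray(xi, dtype=float)
    l = len(xi)
    ni = np.ceil((n - l / 2.0) * xi).astype(int)   # multiplier rounding
    while ni.sum() > n:
        ni[np.argmax((ni - 1) / xi)] -= 1          # shrink where cheapest
    while ni.sum() < n:
        ni[np.argmin(ni / xi)] += 1                # grow where most underfunded
    return ni

# efficient_apportionment([0.5, 0.25, 0.25], 10) -> array([4, 3, 3])
```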

Example 5  Here we take $q = 3$. See Table 2. When $\tau = d = 0$ the D-optimal design seems to be attempting an approximation, in our discrete design space, of the D-optimal design on the continuous interval $[-1, 1]$; this continuous design places mass 0.25 at each of the points $\pm1$ and $\pm1/\sqrt5 \approx \pm0.447$. A similar comment applies to the I-optimal design when $\tau = d = 0$. In each case, increasing $d$ results in a migration of mass towards the centre of the design space, in a manner that varies with $\tau$.

Example 6  For the values of $q$ and $\tau$ used in Examples 4 and 5 we have obtained designs that minimize the maximum loss, as the power $d$ in (14) varies over an 81 point grid spanning $[-4, 4]$. See Table 3. A notable feature of the minimax designs is that they tend to have more support points than do the designs of Examples 4 and 5, when these are evaluated at approximately the powers $d$ that are least favourable for the minimax designs. As we have often noticed in other studies, the additional robustness of the minimax designs tends to be achieved by distributing some replicates into clusters of observations at nearby locations.

Table 3  Minimax designs for straight line ($q = 1$) and cubic ($q = 3$) regression, as in Example 6: support points $x_i$ together with D- and I-optimal design frequencies $n_i$ and least favourable powers $d$
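A minimax computation in the spirit of Example 6 can be sketched as a min-max over the $d$ grid; the details below (whether to standardize losses across $d$, and the use of SLSQP on a nonsmooth objective) are our own choices, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

# Straight line (q = 1): minimize the maximum D-loss over d in [-4, 4].
N, tau = 21, 1.0
x = -1 + 2 * np.arange(N) / (N - 1)
X = np.vander(x, 2, increasing=True)

def loss_for_d(xi, d):
    s2 = (1 + x**2) ** d
    s2 = s2 / s2.sum()                             # normalization of Sect. 4
    return loss_D_and_gradient(xi, X / np.sqrt(s2)[:, None], tau)[0]

d_grid = np.linspace(-4, 4, 81)
res = minimize(lambda xi: max(loss_for_d(xi, d) for d in d_grid),
               np.full(N, 1.0 / N), bounds=[(0, 1)] * N,
               constraints=({'type': 'eq', 'fun': lambda xi: xi.sum() - 1},),
               method='SLSQP')
```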

Acknowledgements  This research has been supported by the Natural Sciences and Engineering Research Council of Canada, and has benefited from the incisive comments of three anonymous reviewers.

Appendix: Derivations

Proof of Theorem 1  The probability model (2) can be written as $W = \Sigma^{-1}E$, where $E = \mathrm{diag}(e_i)$ with $e_i = e^{\eta_i}$. Then (1) becomes
$$ C = \big(U'\Xi EU\big)^{-1}\big(U'\Xi E^2U\big)\big(U'\Xi EU\big)^{-1}. \qquad (A.1) $$
Let $d_i = \sqrt{n_i}\,\eta_i = \sqrt{n\xi_i}\,\eta_i$ and note that the $d_i$ are independent, with zero means and variances $\sigma_\eta^2$. With $D = \mathrm{diag}(d_i)$ we have
$$ E = I_N + \frac1{\sqrt n}\,\Xi^{-1/2}D + \frac1{2n}\,\Xi^{-1}D^2 + R_n, $$
where $R_n = \mathrm{diag}(r_{in})$ with $|r_{in}| \le (n\xi_i)^{-3/2}|d_i|^3e^{|\eta_i|}/6$. By (A2), $R_n = o_p(n^{-1})$ and $E[R_n] = o(n^{-1})$. Inserting these expressions into the components of (A.1) gives
$$ U'\Xi EU \stackrel{\text{def}}{=} G + \frac1{\sqrt n}G_1 + \frac1{2n}G_2 + R_n, \quad\text{for } G = U'\Xi U,\ G_1 = U'\Xi^{1/2}DU,\ G_2 = U'D^2U, $$
with
$$ \big(U'\Xi EU\big)^{-1} = G^{-1} - \frac1{\sqrt n}\,G^{-1}G_1G^{-1} + \frac1n\Big\{G^{-1}G_1G^{-1}G_1G^{-1} - \frac12\,G^{-1}G_2G^{-1}\Big\} + R_n, $$
and
$$ U'\Xi E^2U \stackrel{\text{def}}{=} H = G + \frac2{\sqrt n}G_1 + \frac2n G_2 + R_n. $$
Substituting these expressions into $C = (U'\Xi EU)^{-1}H(U'\Xi EU)^{-1}$ and expanding gives (3) with
$$ C_0 = G^{-1}, \qquad C_1 = 0, \qquad C_2 = G^{-1}G_2G^{-1} - G^{-1}G_1G^{-1}G_1G^{-1}. $$
Substituting into (4) gives
$$ L(\xi) = \mathcal L(C_0) + \frac1n\,\mathrm{tr}\big(\mathcal L'(C_0)\,E\big[G^{-1}G_2G^{-1} - G^{-1}G_1G^{-1}G_1G^{-1}\big]\big). $$
To compute the expectation we note that $E[D^2] = \sigma_\eta^2 I_N$, and that $E[DUMU'D] = \sigma_\eta^2\,\mathrm{diag}(u_i'Mu_i)$ for any non-random $p\times p$ matrix $M$. These observations yield
$$ E\big[G_1G^{-1}G_1\big] = E\big[U'\Xi^{1/2}DUG^{-1}U'D\Xi^{1/2}U\big] = \sigma_\eta^2\,U'\Xi\,\mathrm{diag}\big(u_i'G^{-1}u_i\big)U, \qquad E[G_2] = E\big[U'D^2U\big] = \sigma_\eta^2\,U'U, $$
whence
$$ L(\xi) = \mathcal L(C_0) + \frac{\sigma_\eta^2}n\sum_{i=1}^N\big(1 - \xi_i\,u_i'G^{-1}u_i\big)\,u_i'G^{-1}\mathcal L'(C_0)G^{-1}u_i, $$
which is (5).

Verification of (8)  Define
$$ \varphi(\beta, C) = \log\det C + \tau\sum_{i=1}^N u_i'Cu_i - \tau\sum_{i=1}^N\beta_i\big(u_i'Cu_i\big)^2. $$
Then $L_D(\xi) = \varphi(\xi, G^{-1}(\xi))$, with gradient given by
$$ \nabla L_D(\xi) = \frac{\partial\varphi}{\partial\beta}\bigg|_{\beta=\xi,\,C=G^{-1}(\xi)} + \bigg(\frac{\partial\,\mathrm{vec}\,C}{\partial\beta'}\bigg)'\,\frac{\partial\varphi}{\partial\,\mathrm{vec}\,C}\bigg|_{\beta=\xi,\,C=G^{-1}(\xi)}. $$
We calculate that
$$ \frac{\partial\varphi}{\partial\beta}\bigg|_{\beta=\xi,\,C=G^{-1}} = -\tau\big(q_{\xi,1}^2, \ldots, q_{\xi,N}^2\big)'; $$
$$ \frac{\partial\varphi}{\partial\,\mathrm{vec}\,C}\bigg|_{\beta=\xi,\,C=G^{-1}} = \mathrm{vec}\big\{G + \tau U'U - 2\tau U'\Xi D_qU\big\}; $$
$$ \frac{\partial\,\mathrm{vec}\,C}{\partial\beta'} = -\big(\mathrm{vec}\big(G^{-1}u_1u_1'G^{-1}\big)\ \cdots\ \mathrm{vec}\big(G^{-1}u_Nu_N'G^{-1}\big)\big). $$
Here the Kronecker product is as in Srivastava and Khatri (1979), viz., $A \otimes B = (a_{ij}B)_{i,j}$. Thus
$$ \nabla L_D(\xi) = -\tau\big(q_{\xi,1}^2, \ldots, q_{\xi,N}^2\big)' - \big(\mathrm{vec}(G^{-1}u_1u_1'G^{-1})\ \cdots\ \mathrm{vec}(G^{-1}u_Nu_N'G^{-1})\big)'\,\mathrm{vec}\big\{G + \tau U'U - 2\tau U'\Xi D_qU\big\}. $$
This vector has $i$th element
$$ c_{\xi,i} = -\tau q_{\xi,i}^2 - u_i'G^{-1}\big[G + \tau U'U - 2\tau U'\Xi D_qU\big]G^{-1}u_i, $$
which reduces to (8).

Verification of (11)  Define
$$ \varphi(\beta, C) = \mathrm{tr}\big(XCX'\big) + \tau\sum_{i=1}^N u_i'CX'XCu_i - \tau\sum_{i=1}^N\beta_i\big(u_i'Cu_i\big)\big(u_i'CX'XCu_i\big). $$
Then $L_I(\xi) = \varphi(\xi, G^{-1}(\xi))$, with gradient given as above. We calculate that
$$ \frac{\partial\varphi}{\partial\beta}\bigg|_{\beta=\xi,\,C=G^{-1}} = -\tau\big(q_{\xi,1}r_{\xi,1}, \ldots, q_{\xi,N}r_{\xi,N}\big)'; $$
$$ \frac{\partial\varphi}{\partial\,\mathrm{vec}\,C}\bigg|_{\beta=\xi,\,C=G^{-1}} = \mathrm{vec}\big\{X'X + 2\tau X'XG^{-1}U'U - 2\tau X'XG^{-1}U'\Xi D_qU - \tau U'\Xi D_rU\big\}. $$

Premultiplying by $(\partial\,\mathrm{vec}\,C/\partial\beta')'$ as before, we then calculate that $\nabla L_I(\xi)$ has $i$th element
$$ c_{\xi,i} = -\tau q_{\xi,i}r_{\xi,i} - u_i'G^{-1}\big[X'X + 2\tau X'XG^{-1}U'U - 2\tau X'XG^{-1}U'\Xi D_qU - \tau U'\Xi D_rU\big]G^{-1}u_i, $$
which reduces to (11).

References

Carroll, R.J., Ruppert, D.: Robust estimation in heteroscedastic linear models. Ann. Stat. 10 (1982)
Dette, H., Haines, L.M., Imhof, L.A.: Bayesian and maximin optimal designs for heteroscedastic regression models. Can. J. Stat. 33 (2005)
Fang, Z., Wiens, D.P.: Integer-valued, minimax robust designs for estimation and extrapolation in heteroscedastic, approximately linear models. J. Am. Stat. Assoc. 95 (2000)
Fuller, W.A., Rao, J.N.K.: Estimation for a linear model with unknown diagonal covariance matrix. Ann. Stat. 6 (1978)
Gass, S.I.: Linear Programming: Methods and Applications. McGraw-Hill, New York (1975)
Hooper, P.M.: Iterative weighted least squares estimation in heteroscedastic linear models. J. Am. Stat. Assoc. 88 (1993)
Pukelsheim, F.: Optimal Design of Experiments. Wiley, New York (1993)
Shao, J.: Empirical Bayes estimation of heteroscedastic variances. J. Am. Stat. Assoc. 88 (1993)
Srivastava, M.S., Khatri, C.G.: An Introduction to Multivariate Statistics. North-Holland, New York (1979)
Wiens, D.P.: Minimax robust designs and weights for approximately specified regression models with heteroscedastic errors. Stat. Sin. 8 (1998)
Wiens, D.P.: Robust weights and designs for biased regression models: least squares and generalized M-estimation. J. Stat. Plan. Inference 83 (2000)
