Designs for weighted least squares regression, with estimated weights


Stat Comput (2013) 23:391–401

Designs for weighted least squares regression, with estimated weights

Douglas P. Wiens

Received: 6 July 2011 / Accepted: February 2012 / Published online: February 2012
© Springer Science+Business Media, LLC 2012

Abstract  We study designs, optimal up to and including terms that are $O(n^{-1})$, for weighted least squares regression, when the weights are intended to be inversely proportional to the variances but are estimated with random error. We take a finite, but arbitrarily large, design space from which the support points are to be chosen, and obtain the optimal proportions of observations to be assigned to each point. Specific examples of D- and I-optimal design for polynomial responses are studied. In some cases the same designs that are optimal under homoscedasticity remain so for a range of variance functions; in others there tend to be more support points than are required in the homoscedastic case. We also exhibit minimax designs, that minimize the maximum, over finite classes of variance functions, value of the loss. These also tend to have more support points, often resulting from the breaking down of replicates into clusters.

Keywords  Linear programming · Minimax · Optimal design · Polynomial regression

D.P. Wiens, Department of Mathematical and Statistical Sciences, University of Alberta, Edmonton, Alberta, Canada. doug.wiens@ualberta.ca

1 Introduction and summary

Suppose that an experimenter makes $n$ independent observations on random variables (r.v.'s) $Y$ at several locations $x_1, \ldots, x_N$. Those made at location $x_i$ are denoted $\{Y_{ij}\}_{j=1}^{n_i}$, so that $n = \sum_{i=1}^N n_i$, and have means $x_i'\beta$ and variances $\sigma_i^2 = \sigma^2(x_i)$. We specify only that $\sum_i n_i = n$; there is no requirement that observations be made at all locations, and typically many of the $n_i$ will be zero.

Consider the weighted least squares estimate of $\beta$, with weights $w_i > 0$ on $x_i$ and on $Y_{ij}$. This estimate is, assuming the existence of the inverse,
$$ \hat\beta = (X'W\Xi X)^{-1} X'W\Xi\,\bar y, $$
where $X_{N\times p} = (x_1, \ldots, x_N)'$, $W_{N\times N} = \mathrm{diag}(w_1, \ldots, w_N)$, $\Xi_{N\times N} = \mathrm{diag}(\xi_1, \ldots, \xi_N)$, $\xi_i = n_i/n$, and $\bar y$ is the $N\times 1$ vector of averages with elements $\bar y_i = \sum_{j=1}^{n_i} Y_{ij}/n_i$ (for definiteness, define $\bar y_i = 0$ when $n_i = 0$). The $p\times p$ covariance matrix of the normalized estimates $\sqrt n\,\hat\beta$ is, with $\Sigma = \mathrm{diag}(\sigma_1^2, \ldots, \sigma_N^2)$, given by
$$ C = (X'W\Xi X)^{-1}\, X'W\Sigma\Xi WX\, (X'W\Xi X)^{-1}. \qquad (1) $$

The weights are typically computed from estimated variances and then $C$ is random; we measure the loss by the expected value $E[\mathcal L(C)]$ of some scalar-valued function $\mathcal L$ of this matrix. Possibilities upon which we shall concentrate are $\mathcal L_D(C) = \log\det C$, the logarithm of the generalized variance, and $\mathcal L_I(C) = \mathrm{tr}(XCX')$, the sum, over the design space, of the prediction variances. The problem addressed here is to find a design, i.e. a choice of $\Xi$, so as to minimize the loss in the face of uncertainty about the variances and errors in their estimation. We will assume that the target weights are the inverses of the variances, since these result in regression estimates with maximum efficiency, and that the variances will be estimated, once the data are gathered, through the $n_i$ observations made at $x_i$.
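The estimate and the matrix $C$ of (1) translate directly into code. The following is a minimal Python sketch (a stand-in for the author's MATLAB routines, which are not reproduced here); the function name and argument layout are our own.

```python
import numpy as np

def wls_estimate_and_C(X, w, xi, sigma2, ybar):
    """WLS estimate and the normalized covariance matrix C of Eq. (1).

    X      : (N, p) regressor matrix with rows x_i'
    w      : (N,) positive weights w_i
    xi     : (N,) design proportions xi_i = n_i / n (zeros allowed)
    sigma2 : (N,) true variances sigma_i^2
    ybar   : (N,) vector of averages ybar_i
    """
    W, Xi, Sigma = np.diag(w), np.diag(xi), np.diag(sigma2)
    M = X.T @ W @ Xi @ X                              # X' W Xi X, assumed invertible
    Minv = np.linalg.inv(M)
    beta_hat = Minv @ X.T @ W @ Xi @ ybar             # the WLS estimate
    C = Minv @ X.T @ W @ Sigma @ Xi @ W @ X @ Minv    # Eq. (1)
    return beta_hat, C
```

With the target weights $w_i = 1/\sigma_i^2$, the returned $C$ reduces to $(X'\Sigma^{-1}\Xi X)^{-1}$.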

We entertain a scenario under which the target values are missed because of multiplicative, positive random error:
$$ w_i = \frac1{\hat\sigma_i^2} = \frac{Z_i}{\sigma_i^2}, $$
for positive r.v.'s $Z_i$. This results in the probability model
$$ \log w_i = -\log\sigma_i^2 + \eta_i. \qquad (2) $$
The r.v.'s $\eta_i = \log Z_i$ are assumed to be independent, with zero means and variances $\sigma_\eta^2/n_i$.

The problem of weighting heteroscedastic data in a regression is commonly treated strictly as an estimation problem. Fuller and Rao (1978), Carroll and Ruppert (1982), Shao (1993) and Hooper (1993) made a variety of proposals for this, all with the aim of finding weights inversely proportional to the variances. Treating this as a design problem, Dette et al. (2005) obtained optimal designs, minimax for certain classes of variance functions assumed to be estimated without error. Wiens (1998) and Fang and Wiens (2000) proposed both weights and designs with minimax properties. The maxima were evaluated over classes of departures from both homoscedasticity and from the fitted regression response. In each case the minimax weights depended in a rather involved manner on the design weights and on the least favourable variances. Wiens (2000) obtained designs and weights for homoscedastic regression models, with possibly misspecified response functions. These were required to minimize a function of the covariance matrix of the regression estimates, under a side condition of unbiasedness. The resulting weights could roughly be described as being inversely proportional to the norms of the vectors of regressors.

None of the papers detailed above explicitly addresses the errors caused by the estimation of the variances; it is the purpose of the current article to fill this gap in the literature. We envision that the experimenter will choose a design, gather the data, and then estimate the variances and carry out a final, weighted least squares regression. Our model (2) allows for the estimation of the variances to be done in a variety of ways. We first obtain designs optimal when the $\{\sigma_i^2\}$ are correctly specified. This restriction is tempered by the fact that the optimal designs seem to vary slowly with the variance functions. Nonetheless, we present as well examples in which the designs are (i) simultaneously optimal over a continuum of variance functions (Example 3), and (ii) minimax optimal, over a finite class of variance functions (Example 6).

In the next section we give an expansion of $E[\mathcal L(C)]$ in powers of $n^{-1/2}$, and show that the $O(n^{-1/2})$ term vanishes. This derivation, and some others that are long or less central to the development, are in the Appendix. We then approximate the loss by the sum of the constant and $O(n^{-1})$ terms, and seek to choose $\{\xi_i\}$ in order to minimize this approximate loss. At this point the requirement that $\xi_i$ be an integral multiple of $n^{-1}$ is dropped, thus yielding continuous designs that must be approximated, in a manner described in Sect. 4, by implementable exact designs. In Sect. 3 we present some analytic characterizations of optimal designs. We present a necessary and sufficient first order condition for a design to furnish a local minimum of the loss, and go on to investigate global optimality. The theory is illustrated through some simple examples; the conditions are however sufficiently complex that our more comprehensive designs are studied numerically. Section 4 contains examples of optimal designs in the case of polynomial regression, illustrating how these change with $\sigma_\eta^2/n$ and with the structure of $\{\sigma_i^2\}$.
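Model (2) is easy to simulate, which gives a brute-force check on the loss expansion developed in the next section. A hedged sketch, reusing `wls_estimate_and_C` from above: the $\eta_i$ are taken Gaussian (one convenient case; the model requires only zero means and variances $\sigma_\eta^2/n_i$), the integer allocation is deliberately crude, and `ybar` is irrelevant for $C$ so zeros are passed.

```python
import numpy as np

rng = np.random.default_rng(1)

def mc_loss_D(X, xi, sigma2, n, sigma_eta2, reps=2000):
    """Monte Carlo estimate of E[log det C] under the weight model (2),
    with Gaussian eta_i.  The crude allocation n_i = max(round(n*xi_i), 1)
    is used only to set the error variances sigma_eta^2 / n_i."""
    N = X.shape[0]
    n_i = np.maximum(np.round(n * np.asarray(xi)).astype(int), 1)
    vals = []
    for _ in range(reps):
        eta = rng.normal(0.0, np.sqrt(sigma_eta2 / n_i))
        w = np.exp(-np.log(sigma2) + eta)            # w_i = Z_i / sigma_i^2
        _, C = wls_estimate_and_C(X, w, xi, sigma2, np.zeros(N))
        vals.append(np.linalg.slogdet(C)[1])
    return np.mean(vals)
```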
A message to be gleaned from the theory and examples presented here is that, relative to designs for homoscedastic regression models, those for estimated heteroscedasticity should have more design points, especially in regions of small anticipated variability, and that some of the replicates should be replaced by clusters of observations in nearby but distinct locations. This last point has often also been made in studies of designs which are to be robust against misspecified responses, and so the designs presented here can be expected to inherit some robustness in the latter situations.

2 Expansions of the loss functions

We make the following assumptions:

(A1) The function $\mathcal L$ possesses two continuous derivatives with respect to each element of its argument.
(A2) The expectations $E[w_i^t]$ exist in an open neighbourhood of $t = 0$.

Assumption (A2), together with $w_i = e^{\eta_i}/\sigma_i^2$, ensures that all $\eta_i$ have moment generating functions, hence all moments are finite. We will write $R_n$ (resp., $r_n$) generically, to denote a matrix (resp., a scalar) that is $o_p(n^{-1})$ and for which the expectation is $o(n^{-1})$. We seek an expansion of the form
$$ C = C_0 + \frac1{\sqrt n}\,C_1 + \frac1n\,C_2 + R_n, \qquad (3) $$
with $C_0$ being non-random, and $E[C_1] = 0$. It will in fact turn out that $C_1 = 0$. The assumptions on $\mathcal L$ then allow the expansion
$$ \mathcal L(C) = \mathcal L(C_0) + \mathrm{tr}\Big[\mathcal L'(C_0)\Big(\frac1{\sqrt n}C_1 + \frac1n C_2\Big)\Big] + \frac1{2n}\,\mathrm{vec}'(C_1)\,\mathcal L''(C_0)\,\mathrm{vec}(C_1) + r_n, $$
with
$$ E[\mathcal L(C)] = \mathcal L(C_0) + \frac1n\Big\{\mathrm{tr}\big(\mathcal L'(C_0)E[C_2]\big) + \tfrac12\,\mathrm{tr}\big(\mathcal L''(C_0)\,\mathrm{COV}[\mathrm{vec}(C_1)]\big)\Big\} + r_n. $$

Here vec refers to the concatenation of the columns of a matrix, $\mathcal L'(C)$ is the $p\times p$ matrix with $(i,j)$th element $\partial\mathcal L/\partial c_{ij}$, and $\mathcal L''(C)$ is the $p^2\times p^2$ Hessian matrix. We then approximate $E[\mathcal L(C)]$ by
$$ L(\xi) = \mathcal L(C_0) + \frac1n\Big\{\mathrm{tr}\big(\mathcal L'(C_0)E[C_2]\big) + \tfrac12\,\mathrm{tr}\big(\mathcal L''(C_0)\,\mathrm{COV}[\mathrm{vec}(C_1)]\big)\Big\}, \qquad (4) $$
where $\xi = (\xi_1, \ldots, \xi_N)'$. The following expansion of $L(\xi)$ is derived in the Appendix.

Theorem 1  With notation as above, and with $U = \Sigma^{-1/2}X = (u_1, \ldots, u_N)'$ and $\tau = \sigma_\eta^2/n$, we have that $C_0 = G^{-1}(\xi)$ for $G(\xi) = U'\Xi U$, and that the loss (4) is
$$ L(\xi) = \mathcal L\big(G^{-1}(\xi)\big) + \tau\sum_{i=1}^N\big(1 - \xi_i\,u_i'G^{-1}(\xi)u_i\big)\,u_i'G^{-1}(\xi)\,\mathcal L'\big(G^{-1}(\xi)\big)\,G^{-1}(\xi)\,u_i. \qquad (5) $$
The particular cases noted above are $\mathcal L_D(C) = \log\det C$, with $\mathcal L_D'(C) = C^{-1}$ and
$$ L_D(\xi) = \log\det G^{-1}(\xi) + \tau\sum_{i=1}^N\big(1 - \xi_i\,u_i'G^{-1}(\xi)u_i\big)\,u_i'G^{-1}(\xi)u_i; \qquad (6) $$
and $\mathcal L_I(C) = \mathrm{tr}(XCX')$, with $\mathcal L_I'(C) = X'X$ and
$$ L_I(\xi) = \mathrm{tr}\big(X'XG^{-1}(\xi)\big) + \tau\sum_{i=1}^N\big(1 - \xi_i\,u_i'G^{-1}(\xi)u_i\big)\,u_i'G^{-1}(\xi)X'XG^{-1}(\xi)u_i. \qquad (7) $$

Remarks
1. The parameter $\tau$ introduced in Theorem 1 can be chosen by the designer without a knowledge of $\sigma_\eta^2$, to express the faith that he is willing to place in the accuracy of his variance estimates. If $\tau = 0$ then he designs in anticipation of estimating these variances, and hence the efficient weights, perfectly. In this case only the leading term in (5) is to be minimized by the design. Larger values of $\tau$ correspond to a more conservative approach and a reliance on the second term in the expansion (5).
2. A reviewer has asked about the possible impact of altering the I-criterion by replacing the sum of the prediction variances by their integral over a continuous version of the design space (the convex hull of the set $\{x_i\}_{i=1}^N$ would be one possibility). Denoting this continuous extension by $S$, we would have $\int_S x'Cx\,dx = \mathrm{tr}[CA]$, with $A = \int_S xx'\,dx$ replacing $X'X = \sum_{i=1}^N x_ix_i'$ in $\mathcal L_I(C)$ and $L_I(\xi)$, and little anticipated effect on the resulting designs.

3 Design optimality

We say that a design is locally optimal if it furnishes a local minimum of the loss $L(\xi)$, and globally optimal if it furnishes a global minimum. The conditions for local optimality are thus weaker than those for global optimality, but are correspondingly much easier to verify. In some cases the locally optimal designs found in this way also turn out to be globally optimal. In other cases we have found locally optimal designs but have been unable to answer the question of global optimality. The various Equivalence Theorems in the literature offer little guidance here, since they apply at most only to the leading term of our expansion (4). In this section we study these notions, restricting to the cases of D- and I-optimality.

3.1 Local optimality

The following theorem gives an easily checked condition to verify that a proposed minimizer furnishes at least a local minimum of the loss. First consider $L_D(\xi)$ at (6). We show in the Appendix that the gradient $\nabla L_D(\xi) \stackrel{\text{def}}{=} c_\xi$ is the $N\times1$ vector with elements
$$ c_{\xi,i} = -q_{\xi,i} - \tau\big\{q_{\xi,i}^2 + (Q_\xi^2)_{ii} - 2(Q_\xi\Xi D_qQ_\xi)_{ii}\big\}, \quad i = 1, \ldots, N, \qquad (8) $$
where $Q_\xi = UG^{-1}(\xi)U' = U(U'\Xi U)^{-1}U'$, and $D_q = \mathrm{diag}(q_{\xi,1}, \ldots, q_{\xi,N})$ with $q_{\xi,i} = u_i'G^{-1}(\xi)u_i$. We write $Q_\xi$ and $D_q$ (and later $D_r$) to denote evaluation at $\xi$.
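The loss (6) and gradient (8), as reconstructed here, are short to code. A minimal Python sketch (the function name is ours; `xi` and `U` are numpy arrays; diagonal-matrix products are written as broadcasts):

```python
import numpy as np

def loss_D_and_gradient(xi, U, tau):
    """Approximate D-loss (6) and its gradient (8), as reconstructed above.

    xi : (N,) design proportions; U : (N, p) with rows u_i'; tau = sigma_eta^2 / n.
    """
    G = U.T @ (xi[:, None] * U)                  # G(xi) = U' Xi U
    Ginv = np.linalg.inv(G)
    Q = U @ Ginv @ U.T                           # Q_xi
    q = np.diag(Q)                               # q_{xi,i} = u_i' G^{-1} u_i
    L = -np.linalg.slogdet(G)[1] + tau * np.sum((1.0 - xi * q) * q)
    c = -q - tau * (q**2 + (Q**2).sum(axis=1) - 2.0 * (Q**2) @ (xi * q))
    return L, c
```

A finite-difference check of `c` against `L` is a useful safeguard on the reconstruction of (8).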

Theorem 2  In order that $\xi$ furnish a local minimum of $L_D$, it is necessary and sufficient to have
$$ c_{\xi,i} \ge k_\xi \stackrel{\text{def}}{=} c_\xi'\xi, \quad i = 1, \ldots, N, $$
with equality on the support $S = \{i \mid \xi_i > 0\}$. The value of the constant is
$$ k_\xi = -p - \tau\,\mathrm{tr}\big[Q_\xi(I_N - \Xi D_q)\big] = -p - \tau\sum_{j=1}^N q_{\xi,j}\big(1 - \xi_j q_{\xi,j}\big). \qquad (9) $$

Proof  Suppose first that $\xi$ furnishes a local minimum. Then it is necessary that the directional derivative, in the direction of any distribution $\xi_1$, be non-negative:
$$ \frac{d}{dt}L_D\big((1 - t)\xi + t\xi_1\big)\Big|_{t=0^+} = c_\xi'(\xi_1 - \xi) = \sum_{i=1}^N(c_{\xi,i} - k_\xi)\,\xi_{1,i} \ge 0. \qquad (10) $$
Since $\xi_1$ is arbitrary, (10) requires $c_{\xi,i} \ge k_\xi$ for $i = 1, \ldots, N$. Then putting $\xi_1 = \xi$ in (10), so that $\sum_i(c_{\xi,i} - k_\xi)\xi_i = 0$, yields that $c_{\xi,i} = k_\xi$ on $S$. That $k_\xi$ is given by (9) is a straightforward calculation. Conversely, if the conditions of the theorem hold, then the inequality in (10) is immediate, and so $\xi$ furnishes a local minimum.

For I-optimality we require further definitions. Let $R_\xi = UG^{-1}(\xi)X'XG^{-1}(\xi)U'$, with diagonal elements $r_{\xi,i} = u_i'G^{-1}(\xi)X'XG^{-1}(\xi)u_i$. Set $D_r = \mathrm{diag}(r_{\xi,1}, \ldots, r_{\xi,N})$. The gradient $\nabla L_I(\xi) \stackrel{\text{def}}{=} c_\xi$ (again derived in the Appendix) is the $N\times1$ vector with elements
$$ c_{\xi,i} = -r_{\xi,i} - \tau\big\{q_{\xi,i}r_{\xi,i} + 2(R_\xi Q_\xi)_{ii} - 2(R_\xi\Xi D_qQ_\xi)_{ii} - (Q_\xi\Xi D_rQ_\xi)_{ii}\big\}, \quad i = 1, \ldots, N. \qquad (11) $$
The proof of the following theorem is identical to that of Theorem 2, although the final expressions are less amenable to simplification.

Theorem 3  In order that $\xi$ furnish a local minimum of $L_I$, it is necessary and sufficient to have $c_{\xi,i} \ge k_\xi \stackrel{\text{def}}{=} c_\xi'\xi$, $i = 1, \ldots, N$, with equality on the support $S = \{i \mid \xi_i > 0\}$. The value of the constant is
$$ k_\xi = -\mathrm{tr}\big(X'XG^{-1}(\xi)\big) - \tau\Big[\sum_{j=1}^N\xi_j q_{\xi,j}r_{\xi,j} + 2\,\mathrm{tr}(R_\xi) - 2\,\mathrm{tr}\big(R_\xi\Xi D_qQ_\xi\Xi\big) - \mathrm{tr}\big(Q_\xi\Xi D_rQ_\xi\Xi\big)\Big]. \qquad (12) $$

Example 1  Suppose that $p = 1$, i.e. that there is no intercept and only one independent variable. We verify that the design that places all mass where $x_i^2/\sigma_i^2$ is a maximum satisfies the conditions of Theorems 2 and 3, hence is locally both D- and I-optimal. For this put $u_i = x_i/\sigma_i$ and $s_i = u_i^2$, and order these as $s_1 \le \cdots \le s_N$. The support of $\xi$ is $S = \{i \mid s_i = s_N\}$. We calculate from (8) and (9) that
$$ c_{\xi,i} = -\frac{s_i}{s_N} - \tau\,\frac{s_i}{s_N}\Big(\frac{s_i}{s_N} + \frac{\sum_{j=1}^N s_j}{s_N} - 2\Big), \qquad k_\xi = -1 - \tau\Big(\frac{\sum_{j=1}^N s_j}{s_N} - 1\Big), $$
and then
$$ c_{\xi,i} - k_\xi = \frac{s_N - s_i}{s_N}\Big[1 + \tau\,\frac{s_i - s_N + \sum_{j=1}^N s_j}{s_N}\Big] \ge 0, $$
with equality on $S$, so that Theorem 2 applies. Similarly, from (11) and (12),
$$ c_{\xi,i} = -\frac{\|x\|^2 s_i}{s_N^2}\Big[1 + \tau\,\frac{s_i + 2\sum_j s_j - 3s_N}{s_N}\Big], \qquad k_\xi = -\frac{\|x\|^2}{s_N}\Big[1 + \tau\,\frac{s_N + 2\sum_j s_j - 3s_N}{s_N}\Big], $$
and then
$$ c_{\xi,i} - k_\xi = \frac{\|x\|^2(s_N - s_i)}{s_N^2}\Big[1 + \tau\,\frac{s_i - 2s_N + 2\sum_j s_j}{s_N}\Big] \ge 0, $$
with equality on $S$. We revisit this example in the next section, and show that this design is in fact globally D- and I-optimal.

Example 2  To illustrate the satisfaction of the conditions of Theorem 2, we use the methods described in Sect. 4 below to obtain a locally D-optimal design for quartic regression. The independent variable $x$ takes on values uniformly spaced over $[-1, 1]$:
$$ x_i = -1 + \frac{2(i-1)}{N-1}, \quad i = 1, \ldots, N, \qquad (13) $$
with $N = 21$. The variance function is
$$ \sigma_i^2 \propto (1 + x_i^2)^d, \qquad (14) $$
with $d = 1$. The MATLAB code is available from us. We take $\tau = 1$ and obtain the design given in the left plot of Fig. 1. In the right plot we superimpose the scaled values of $c_{\xi,i} - k_\xi$, illustrating the satisfaction of the conditions of Theorem 2.

Fig. 1  D-optimal design $\xi_i$ plotted against $x_i$ for quartic regression, together with values of $c_{\xi,i} - k_\xi$, as described in Example 2
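The minimization behind Example 2 can be mimicked with an off-the-shelf constrained optimizer; the paper used MATLAB's routines, and the following scipy sketch is our stand-in. It reuses `loss_D_and_gradient` from above; $N = 21$ and $d = \tau = 1$ are our reading of the garbled source.

```python
import numpy as np
from scipy.optimize import minimize

N, d, tau, p = 21, 1.0, 1.0, 5                 # quartic response: p = 5
x = -1 + 2 * np.arange(N) / (N - 1)            # grid (13)
sigma2 = (1 + x**2) ** d                       # variances (14)
sigma2 = sigma2 / sigma2.sum()                 # normalization as in Sect. 4
X = np.vander(x, p, increasing=True)           # columns 1, x, ..., x^4
U = X / np.sqrt(sigma2)[:, None]

res = minimize(lambda xi: loss_D_and_gradient(xi, U, tau)[0],
               np.full(N, 1.0 / N), bounds=[(0, 1)] * N,
               constraints=({'type': 'eq', 'fun': lambda xi: xi.sum() - 1},),
               method='SLSQP')

# Theorem 2 check: c_i - k >= 0, with (near-)equality on the support.
_, c = loss_D_and_gradient(res.x, U, tau)
print(np.round(c - c @ res.x, 4))
```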

Example 3  Suppose that one anticipates fitting a straight line response with an intercept, with possible design points $-1 = x_1 < x_2 < \cdots < x_{N-1} < x_N = 1$. Assume that these and the variances are symmetric: $x_{N-i+1} = -x_i$ and $\sigma_{N-i+1}^2 = \sigma_i^2$. With constant variances, the D- and I-optimal design places half of its mass at each of $x = \pm1$. Under what conditions on $\{\sigma_i^2\}$ does this design continue to satisfy the conditions of Theorems 2 or 3 for local D- or I-optimality? For this design $\xi = \frac12(\delta_{x_1} + \delta_{x_N})$ we have, with $\nu_i = \sigma_i^2/\sigma_N^2$, that $q_{\xi,i} = (1 + x_i^2)/\nu_i$ and
$$ c_{\xi,i} - k_\xi = (2 - q_{\xi,i}) + \tau M_i, \quad\text{for } M_i = \sum_{j=1}^N\Big[q_{\xi,j} - \frac{(1 + x_ix_j)^2}{\nu_i\nu_j}\Big] - (2 - q_{\xi,i})^2. $$
Note that $c_{\xi,i} - k_\xi = 0$ for $i = 1, N$; thus if
$$ q_{\xi,i} \le 2, \quad\text{i.e.}\quad \nu_i \ge \frac{1 + x_i^2}{2}, \quad i = 2, \ldots, N-1, \qquad (15) $$
then the conditions of Theorem 2 hold for $\tau \le \tau_D$, where
$$ \tau_D = \begin{cases}\min_{\{i \mid M_i < 0\}}\dfrac{2 - q_{\xi,i}}{-M_i}, & \text{if } \{i \mid M_i < 0\} \neq \emptyset,\\ \infty, & \text{otherwise.}\end{cases} $$
We note that the variance functions (14) satisfy (15) for $d \le 1$. Similarly,
$$ c_{\xi,i} - k_\xi = \sigma_N^2\Big[N\Big(1 - \frac1{\nu_i}\Big) + \|x\|^2\Big(1 - \frac{x_i^2}{\nu_i}\Big)\Big] + \tau\tilde M_i, $$
for
$$ \tilde M_i = \frac{\sigma_N^2}{\nu_i}\Big\{4\big(N + \|x\|^2x_i^2\big) + \big(N + \|x\|^2\big)\big(1 + x_i^2\big) - \frac{(1 + x_i^2)(N + \|x\|^2x_i^2)}{\nu_i} - 2\sum_{j=1}^N\frac{(1 + x_ix_j)(N + \|x\|^2x_ix_j)}{\nu_j}\Big\} + 2\sigma_N^2\sum_{j=1}^N\frac{N + \|x\|^2x_j^2}{\nu_j} - 4\sigma_N^2\big(N + \|x\|^2\big); $$
here $x = (x_1, \ldots, x_N)'$ and $\|x\|^2 = \sum_j x_j^2$. Thus if
$$ N\Big(1 - \frac1{\nu_i}\Big) + \|x\|^2\Big(1 - \frac{x_i^2}{\nu_i}\Big) \ge 0, \quad i = 2, \ldots, N-1, $$
then the conditions of Theorem 3 hold for $\tau \le \tau_I$, where
$$ \tau_I = \begin{cases}\min_{\{i \mid \tilde M_i < 0\}}\dfrac{\sigma_N^2\big[N(1 - 1/\nu_i) + \|x\|^2(1 - x_i^2/\nu_i)\big]}{-\tilde M_i}, & \text{if } \{i \mid \tilde M_i < 0\} \neq \emptyset,\\ \infty, & \text{otherwise.}\end{cases} $$
See Fig. 2 for plots of $\tau_D$ and $\tau_I$ vs. $d$, when the design space and variance functions are as at (13), with $N = 21$, and (14) respectively.
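The thresholds of Example 3 can also be obtained without the closed forms for $M_i$: since $c_{\xi,i} - k_\xi$ is affine in $\tau$, two gradient evaluations suffice. A sketch, reusing `loss_D_and_gradient` and our reconstructed formulas (the function name and tolerances are ours):

```python
import numpy as np

def tau_D_threshold(x, sigma2):
    """Largest tau for which the two-point design of Example 3 (mass 1/2 at
    x_1 and x_N) satisfies the Theorem 2 condition; uses the fact that
    c_{xi,i} - k_xi is affine in tau."""
    N = len(x)
    U = np.column_stack([np.ones(N), x]) / np.sqrt(sigma2)[:, None]
    xi = np.zeros(N); xi[0] = xi[-1] = 0.5
    gap = lambda tau: (lambda c: c - c @ xi)(loss_D_and_gradient(xi, U, tau)[1])
    A, B = gap(0.0), gap(1.0) - gap(0.0)           # gap(tau) = A + tau * B
    if np.any(A < -1e-12):
        return np.nan                              # condition fails even at tau = 0
    neg = B < -1e-12
    return np.inf if not neg.any() else float(np.min(A[neg] / -B[neg]))

# Sweep d to trace the tau_D curve of Fig. 2:
x = -1 + 2 * np.arange(21) / 20
for d in (0.5, 1.0, 1.5):
    print(d, tau_D_threshold(x, (1 + x**2) ** d))
```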

Fig. 2  Values of $\tau_D$ (left; $= \infty$ for $d < 0.43$) and $\tau_I$ (right; $= \infty$ for $d < 0.156$) versus $d$, for variances given by (14) and $N = 21$. The two-point design $0.5\delta_{-1} + 0.5\delta_{+1}$ is D-optimal for $\tau \in [0, \tau_D]$ and I-optimal for $\tau \in [0, \tau_I]$. When $d > 1.2$ (resp., $d > 0.45$) the interval $[0, \tau_D]$ (resp., $[0, \tau_I]$) is empty

3.2 Global optimality

Recall the definitions of $G(\xi)$ and $u_i$ from Theorem 1. We can view a design $\xi$ as a probability distribution, and then if $P_\xi(u = u_i) = \xi_i$ we have that
$$ G(\xi) = \sum_{i=1}^N\xi_i u_iu_i' = E_\xi[uu']. \qquad (16) $$
Thus, define $\Xi$ to be the set of all $N$-point distributions $\xi$, and define $\mathcal G$ to be the set of positive definite matrices that arise as in (16):
$$ \mathcal G = \big\{G \mid G > 0 \text{ and } G = E_\xi[uu'] \text{ for some } \xi \in \Xi\big\}. $$
Note that $\mathcal G$ is a convex subset of the set of symmetric $p\times p$ matrices. For $G \in \mathcal G$ let $\xi_G$ be any distribution for which $G = E_{\xi_G}[uu']$, and define $\Xi_G = \{\xi_G\}$, a convex subset of $\Xi$. It is easily verified that, for any $\xi$,
$$ L_D(\xi) \ge \inf_{G\in\mathcal G}\ \inf_{\xi\in\Xi_G} L_D(\xi), $$
and so a path to the solution is to minimize in stages: first over $\Xi_G$ for fixed $G$, then over $\mathcal G$. From (6) it is seen that the first stage in this process is equivalent to maximizing $E_\xi[(u'G^{-1}u)^2]$; this is continued in the following theorem.

Theorem 4  With notation as above, set $\nu_D(G) = \max\{E_\xi[(u'G^{-1}u)^2] \mid \xi \in \Xi_G\}$, and $c_i = (u_i'G^{-1}u_i)^2$. Then
$$ \nu_D(G) = \min_M\max_{i=1,\ldots,N}\big\{c_i + \mathrm{tr}\big[M\big(u_iu_i' - G\big)\big]\big\}, \qquad (17) $$
where the minimum is over symmetric $p\times p$ matrices $M$. Let
$$ G_D = \arg\min_{G\in\mathcal G}\big\{\log\det G^{-1} + \tau\,\mathrm{tr}\big[G^{-1}U'U\big] - \tau\nu_D(G)\big\}, \qquad (18) $$
assuming that the minimum is attained. Then the minimum value of $L_D(\xi)$ is
$$ L_D(\xi_D) = \log\det G_D^{-1} + \tau\,\mathrm{tr}\big[G_D^{-1}U'U\big] - \tau\nu_D(G_D), \qquad (19) $$
and the minimizing $\xi_D$ is any design maximizing $E_\xi[(u'G_D^{-1}u)^2]$ within the class of designs satisfying $E_\xi[uu'] = G_D$.
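The stage-one maximization in Theorem 4 is a linear program (written out explicitly as (20) and (21) in the proof below), so it can be handed to an off-the-shelf solver. A sketch with scipy (our own function name; any consistent ordering of the upper-triangle elements serves for $\mathrm{vec}_S$):

```python
import numpy as np
from scipy.optimize import linprog

def stage_one_lp(U, G):
    """Among designs xi with E_xi[uu'] = G, maximize E_xi[(u' G^{-1} u)^2];
    the LP (20)-(21) of the proof of Theorem 4."""
    N, p = U.shape
    iu = np.triu_indices(p)
    vec_S = lambda A: A[iu]                        # upper triangle of a symmetric matrix
    W = np.column_stack([vec_S(np.outer(u, u)) for u in U])
    V = np.vstack([W, np.ones(N)])                 # (21b)
    dvec = np.append(vec_S(np.asarray(G)), 1.0)    # (21c)
    Ginv = np.linalg.inv(G)
    cvec = np.einsum('ij,jk,ik->i', U, Ginv, U)**2 # (21d): c_i = (u_i' G^{-1} u_i)^2
    res = linprog(-cvec, A_eq=V, b_eq=dvec, bounds=[(0, None)] * N)
    return res.x, -res.fun                         # xi and nu_D(G)
```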

Proof  For fixed $G(\xi) = G$, we have
$$ L_D(\xi) = \log\det G^{-1} + \tau\,\mathrm{tr}\big[G^{-1}U'U\big] - \tau E_\xi\big[(u'G^{-1}u)^2\big], $$
and we first seek to minimize this over $\xi \in \Xi_G$. This is a linear programming problem. Let $\mathrm{vec}_S(G)$ denote the $p(p+1)/2 \times 1$ vector consisting of those elements $g_{jk}$ that appear in $\mathrm{vec}(G)$, but with $j \le k$, i.e. the columnwise representation of the upper triangle of the symmetric matrix $G$. Expressing the constraints $G = E_\xi[uu']$, $\sum_i\xi_i = 1$, $\xi_i \ge 0$ in linear programming notation, the linear program becomes
$$ \text{maximize } c'\xi \quad\text{subject to } V\xi = d \text{ and } \xi \ge 0_{N\times1}, \qquad (20) $$
where
$$ p_0 = \frac{p(p+1)}2 + 1, \qquad (21a) $$
$$ V_{p_0\times N} = \begin{pmatrix} W\\ 1_N'\end{pmatrix}\ \text{ for } W_{\frac{p(p+1)}2\times N} = \big(\mathrm{vec}_S(u_1u_1')\ \cdots\ \mathrm{vec}_S(u_Nu_N')\big), \qquad (21b) $$
$$ d_{p_0\times1} = \big(\mathrm{vec}_S'(G),\ 1\big)', \qquad (21c) $$
$$ c_{N\times1} = (c_1, \ldots, c_N)'. \qquad (21d) $$
The set of feasible solutions is non-empty, by the definition of $\mathcal G$, and then standard linear programming theory ensures that the maximum is attained at an extreme point of the convex set generated by the set of feasible solutions. Since this set is bounded ($\xi'1_N = 1$ for $\xi \ge 0$) the maximum is finite. The dual problem is to find $\mu_{p_0\times1}$ to minimize $d'\mu$ subject to $V'\mu \ge c$. Equivalently,
$$ \text{minimize } \mu_{p_0} \text{ subject to } \mu_{p_0} + \big(\mu_1, \ldots, \mu_{p(p+1)/2}\big)\,\mathrm{vec}_S\big(u_iu_i' - G\big) \ge c_i \text{ for } i = 1, \ldots, N. \qquad (22) $$
By the Duality Theorem (Gass 1975, p. 119) the solutions to these problems have the same extrema:
$$ \nu_D(G) = \mu_{p_0}^*. \qquad (23) $$
To analyze this, first relabel $\mu_1, \ldots, \mu_{p(p+1)/2}$ as $\mu_{11}; \mu_{12}, \mu_{22}; \ldots; \mu_{1p}, \mu_{2p}, \ldots, \mu_{pp}$, and define a symmetric matrix $M$ by
$$ m_{jk} = \begin{cases}\mu_{jk}/2, & j < k,\\ \mu_{jj}, & j = k,\\ \mu_{kj}/2, & j > k.\end{cases} $$
Then
$$ \big(\mu_1, \ldots, \mu_{p(p+1)/2}\big)\,\mathrm{vec}_S\big(u_iu_i' - G\big) = \sum_{j,k=1}^p m_{jk}\big(u_iu_i' - G\big)_{jk} = \mathrm{tr}\big[M\big(u_iu_i' - G\big)\big], $$
and (22) becomes
$$ \text{minimize } \mu_{p_0} \text{ subject to } \mu_{p_0} \ge \max_{i=1,\ldots,N}\big\{c_i + \mathrm{tr}\big[M\big(u_iu_i' - G\big)\big]\big\}. $$
Thus $\mu_{p_0}^*$ is the minimum, over symmetric matrices $M$, of this maximum; using (23) as well we obtain (17). Now
$$ \min_{\xi\in\Xi_G}L_D(\xi) = \log\det G^{-1} + \tau\,\mathrm{tr}\big[G^{-1}U'U\big] - \tau\nu_D(G), $$
and so if the minimum in (18) is attained we immediately have (19).

For $L_I(\xi)$ at (7) we obtain the following, whose proof is essentially identical to that of Theorem 4 and so is omitted.

Theorem 5  With notation as above, set $\nu_I(G) = \max\{E_\xi[(u'G^{-1}u)(u'G^{-1}X'XG^{-1}u)] \mid \xi \in \Xi_G\}$, and $c_i = (u_i'G^{-1}u_i)(u_i'G^{-1}X'XG^{-1}u_i)$. Then
$$ \nu_I(G) = \min_M\max_{i=1,\ldots,N}\big\{c_i + \mathrm{tr}\big[M\big(u_iu_i' - G\big)\big]\big\}, $$
where the minimum is over symmetric $p\times p$ matrices $M$. Let
$$ G_I = \arg\min_{G\in\mathcal G}\big\{\mathrm{tr}\big(X'XG^{-1}\big) + \tau\,\mathrm{tr}\big[XG^{-1}U'UG^{-1}X'\big] - \tau\nu_I(G)\big\}, $$
assuming that the minimum is attained. Then the minimum value of $L_I(\xi)$ is
$$ L_I(\xi_I) = \mathrm{tr}\big(XG_I^{-1}X'\big) + \tau\,\mathrm{tr}\big[XG_I^{-1}U'UG_I^{-1}X'\big] - \tau\nu_I(G_I), $$
and the minimizing $\xi_I$ is any design maximizing $E_\xi[(u'G_I^{-1}u)(u'G_I^{-1}X'XG_I^{-1}u)]$ within the class of designs satisfying $E_\xi[uu'] = G_I$.

Remark  In some cases (the continuation of Example 1 below, for instance) the condition $E_\xi[uu'] = G_D$ determines the optimizing $\xi_D$ uniquely. If there is more than one design satisfying this condition, then the required optimizer is the solution to the linear program described by (20) and (21). For I-optimality, $c_i$ in (21d) is replaced by $(u_i'G^{-1}u_i)(u_i'G^{-1}X'XG^{-1}u_i)$.

Example 1 (continued)  Assume that $s_1 > 0$, so that $\mathcal G = \{g \mid g \in [s_1, s_N]\}$. This assumption does not affect the end result, but simplifies the discussion. For $g \in \mathcal G$ we find that
$$ \nu_D(g) = \min_m\max_{i=1,\ldots,N}\Big\{\Big(\frac{s_i}g\Big)^2 + m(s_i - g)\Big\}. $$
The maximum of the quadratic in $s$ is attained at one of the extremes:
$$ \nu_D(g) = \min_m\max\Big\{\Big(\frac{s_1}g\Big)^2 + m(s_1 - g),\ \Big(\frac{s_N}g\Big)^2 + m(s_N - g)\Big\}. $$
This maximum of two linear functions of $m$ (one increasing, the other decreasing) is minimized at the point of intersection: $m = -(s_N + s_1)/g^2$, with
$$ \nu_D(g) = \frac{(s_N + s_1)g - s_Ns_1}{g^2}. $$
Then (18) becomes
$$ g_D = \arg\min_{g\in[s_1,s_N]}\Big\{-\log g + \frac\tau g\Big[\sum_{i=2}^{N-1}s_i + \frac{s_Ns_1}g\Big]\Big\}. $$
The function being minimized is a decreasing function of $g$, minimized at $g = s_N$. The only design attaining $E_\xi[s] = s_N$ is that placing all mass at $s_N$, and so this design is globally (rather than merely locally) D-optimal.
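The closed form for $\nu_D(g)$ can be validated against the LP sketch given after Theorem 4; a small check with assumed values of $s_i$:

```python
import numpy as np

# Numerical check of Example 1 (continued), p = 1: the closed form
# nu_D(g) = ((s_N + s_1) g - s_N s_1) / g^2 versus the LP (20)-(21).
s = np.array([0.3, 0.8, 1.1, 2.0])                 # assumed values s_i = u_i^2
U = np.sqrt(s)[:, None]
g = 1.4                                            # any g in [s_1, s_N]
_, nu_lp = stage_one_lp(U, np.array([[g]]))
nu_formula = ((s[-1] + s[0]) * g - s[-1] * s[0]) / g**2
print(nu_lp, nu_formula)                           # should agree
```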

For I-optimality we instead find that
$$ \nu_I(g) = \min_m\max\Big\{\frac{s_1^2}{g^3}\|x\|^2 + m(s_1 - g),\ \frac{s_N^2}{g^3}\|x\|^2 + m(s_N - g)\Big\} = \frac{\|x\|^2}g\,\nu_D(g), $$
and
$$ g_I = \arg\min_{g\in[s_1,s_N]}\Big\{\frac{\|x\|^2}g + \frac{\tau\|x\|^2}{g^2}\Big[\sum_{i=2}^{N-1}s_i + \frac{s_Ns_1}g\Big]\Big\}. $$
The minimum is again attained at $g = s_N$, and so the design placing all mass at $s_N$ is also globally I-optimal.

4 Examples: polynomial regression

We have minimized the loss functions $L_D(\xi)$ and $L_I(\xi)$, concentrating on polynomial regression with design space given by (13) with $N = 21$. Thus the $j$th column of $X_{N\times p}$ is $(x_1^{j-1}, \ldots, x_N^{j-1})'$ for $j = 1, \ldots, p$, and the degree of the polynomial to be fitted is $q = p - 1$. Although the theory of Sect. 3.2 gives insight into the structure of the solutions, it falls short of being conveniently implemented numerically. Thus the minimization of (6) and (7) was carried out directly, using nonlinear constrained minimization routines in MATLAB. The specific examples detailed here have assumed the variance functions (14) and a design of size $n = 100$. The designs are not affected by the constant of proportionality, which we choose such that $\sum_{i=1}^N\sigma_i^2 = 1$. In each case the exact, minimizing values $\{\xi_i\}$ are obtained, and then integer allocations $n_i \approx n\xi_i$ are obtained using the efficient design apportionment of Pukelsheim (1993). This is a rounding procedure with, among others, the property of sample size monotonicity: if a new point is to be allocated to an existing design, then none of the current allocations will be reduced. Although the $\{\xi_i\}$ are symmetric ($\xi_{N-i+1} = \xi_i$) the rounding sometimes results in slight asymmetries.

Example 4 covers straight line regression. Designs for cubic regression are given in Example 5. In Example 6 we present minimax designs, which minimize the maximum loss, with the maximum evaluated over a range of values of $d$ in (14).

Table 1  Designs for straight line regression, as in Example 4: support points $x_i$ together with D- and I-optimal design frequencies $n_i(D)$, $n_i(I)$, for $d = 0, 2, 4$ and $\tau = 0, 0.5, 1$

Table 2  Designs for cubic regression, as in Example 5: support points $x_i$ together with D- and I-optimal design frequencies $n_i(D)$, $n_i(I)$, for $d = 0, 2, 4$ and $\tau = 0, 0.5, 1$

Example 4  Here we take $q = 1$ (straight line regression). Designs minimizing $L_D(\xi)$ and $L_I(\xi)$ are detailed in Table 1, for the combinations of $\tau = 0, 0.5, 1$ and $d = 0, 2, 4$. For each pair $(\tau, d)$ the D- and I-optimal designs turn out to be supported on the same points. When $d = 0$ they coincide with the variance minimizing design under homoscedasticity. For $d > 0$ they differ, in manners depending on the type of optimality and on $\tau$; in particular they are typically supported on three points rather than two. Note also that for large $d$ the masses at $\pm1$ move towards the centre.
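The rounding step described at the start of this section can be sketched as follows. This is an efficient-rounding routine in the style of Pukelsheim (1993), not a verbatim transcription of the book's algorithm; the function name is ours.

```python
import numpy as np

def efficient_apportionment(xi, n):
    """Round an exact design into integer frequencies summing to n.

    xi : positive weights on the support, summing to 1;  n : sample size.
    """
    xi = np.asarray(xi, dtype=float)
    l = len(xi)
    ni = np.ceil((n - l / 2.0) * xi).astype(int)   # multiplier rounding
    while ni.sum() > n:
        ni[np.argmax((ni - 1) / xi)] -= 1          # shrink where cheapest
    while ni.sum() < n:
        ni[np.argmin(ni / xi)] += 1                # grow where most underfunded
    return ni

# efficient_apportionment([0.5, 0.25, 0.25], 10) -> array([4, 3, 3])
```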

Example 5  Here we take $q = 3$. See Table 2. When $\tau = d = 0$ the D-optimal design seems to be attempting an approximation, in our discrete design space, of the D-optimal design on the continuous interval $[-1, 1]$; this continuous design places mass 0.25 at each of the points $\pm1$ and $\pm1/\sqrt5 \approx \pm0.447$. A similar comment applies to the I-optimal design when $\tau = d = 0$. In each case, increasing $d$ results in a migration of mass towards the centre of the design space, in a manner that varies with $\tau$.

Example 6  For the values of $q$ and $\tau$ used in Examples 4 and 5 we have obtained designs that minimize the maximum loss, as the power $d$ in (14) varies over an 81 point grid spanning $[-4, 4]$. See Table 3. A notable feature of the minimax designs is that they tend to have more support points than do the designs of Examples 4 and 5, when these are evaluated at approximately the powers $d$ that are least favourable for the minimax designs. As we have often noticed in other studies, the additional robustness of the minimax designs tends to be achieved by distributing some replicates into clusters of observations at nearby locations.

Table 3  Minimax designs for straight line ($q = 1$) and cubic ($q = 3$) regression, as in Example 6: support points $x_i$ together with D- and I-optimal design frequencies $n_i$ and least favourable powers $d$
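A minimax computation in the spirit of Example 6 can be sketched as a min-max over the $d$ grid; the details below (whether to standardize losses across $d$, and the use of SLSQP on a nonsmooth objective) are our own choices, not the paper's.

```python
import numpy as np
from scipy.optimize import minimize

# Straight line (q = 1): minimize the maximum D-loss over d in [-4, 4].
N, tau = 21, 1.0
x = -1 + 2 * np.arange(N) / (N - 1)
X = np.vander(x, 2, increasing=True)

def loss_for_d(xi, d):
    s2 = (1 + x**2) ** d
    s2 = s2 / s2.sum()                             # normalization of Sect. 4
    return loss_D_and_gradient(xi, X / np.sqrt(s2)[:, None], tau)[0]

d_grid = np.linspace(-4, 4, 81)
res = minimize(lambda xi: max(loss_for_d(xi, d) for d in d_grid),
               np.full(N, 1.0 / N), bounds=[(0, 1)] * N,
               constraints=({'type': 'eq', 'fun': lambda xi: xi.sum() - 1},),
               method='SLSQP')
```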

Acknowledgements  This research has been supported by the Natural Sciences and Engineering Research Council of Canada, and has benefited from the incisive comments of three anonymous reviewers.

Appendix: Derivations

Proof of Theorem 1  The probability model (2) can be written as $W = \Sigma^{-1}E$, where $E = \mathrm{diag}(e_i)$ with $e_i = e^{\eta_i}$. Then (1) becomes
$$ C = \big(U'\Xi EU\big)^{-1}\big(U'\Xi E^2U\big)\big(U'\Xi EU\big)^{-1}. \qquad (A.1) $$
Let $d_i = \sqrt{n_i}\,\eta_i = \sqrt{n\xi_i}\,\eta_i$ and note that the $d_i$ are independent, with zero means and variances $\sigma_\eta^2$. With $D = \mathrm{diag}(d_i)$ we have
$$ E = I_N + \frac1{\sqrt n}\,\Xi^{-1/2}D + \frac1{2n}\,\Xi^{-1}D^2 + R_n, $$
where $R_n = \mathrm{diag}(r_{in})$ with $|r_{in}| \le (n\xi_i)^{-3/2}|d_i|^3e^{|\eta_i|}/6$. By (A2), $R_n = o_p(n^{-1})$ and $E[R_n] = o(n^{-1})$. Inserting these expressions into the components of (A.1) gives
$$ U'\Xi EU \stackrel{\text{def}}{=} G + \frac1{\sqrt n}G_1 + \frac1{2n}G_2 + R_n, \quad\text{for } G = U'\Xi U,\ G_1 = U'\Xi^{1/2}DU,\ G_2 = U'D^2U, $$
with
$$ \big(U'\Xi EU\big)^{-1} = G^{-1} - \frac1{\sqrt n}\,G^{-1}G_1G^{-1} + \frac1n\Big\{G^{-1}G_1G^{-1}G_1G^{-1} - \frac12\,G^{-1}G_2G^{-1}\Big\} + R_n, $$
and
$$ U'\Xi E^2U \stackrel{\text{def}}{=} H = G + \frac2{\sqrt n}G_1 + \frac2n G_2 + R_n. $$
Substituting these expressions into $C = (U'\Xi EU)^{-1}H(U'\Xi EU)^{-1}$ and expanding gives (3) with
$$ C_0 = G^{-1}, \qquad C_1 = 0, \qquad C_2 = G^{-1}G_2G^{-1} - G^{-1}G_1G^{-1}G_1G^{-1}. $$
Substituting into (4) gives
$$ L(\xi) = \mathcal L(C_0) + \frac1n\,\mathrm{tr}\big(\mathcal L'(C_0)\,E\big[G^{-1}G_2G^{-1} - G^{-1}G_1G^{-1}G_1G^{-1}\big]\big). $$
To compute the expectation we note that $E[D^2] = \sigma_\eta^2 I_N$, and that $E[DUMU'D] = \sigma_\eta^2\,\mathrm{diag}(u_i'Mu_i)$ for any non-random $p\times p$ matrix $M$. These observations yield
$$ E\big[G_1G^{-1}G_1\big] = E\big[U'\Xi^{1/2}DUG^{-1}U'D\Xi^{1/2}U\big] = \sigma_\eta^2\,U'\Xi\,\mathrm{diag}\big(u_i'G^{-1}u_i\big)U, \qquad E[G_2] = E\big[U'D^2U\big] = \sigma_\eta^2\,U'U, $$
whence
$$ L(\xi) = \mathcal L(C_0) + \frac{\sigma_\eta^2}n\sum_{i=1}^N\big(1 - \xi_i\,u_i'G^{-1}u_i\big)\,u_i'G^{-1}\mathcal L'(C_0)G^{-1}u_i, $$
which is (5).

Verification of (8)  Define
$$ \varphi(\beta, C) = \log\det C + \tau\sum_{i=1}^N u_i'Cu_i - \tau\sum_{i=1}^N\beta_i\big(u_i'Cu_i\big)^2. $$
Then $L_D(\xi) = \varphi(\xi, G^{-1}(\xi))$, with gradient given by
$$ \nabla L_D(\xi) = \frac{\partial\varphi}{\partial\beta}\bigg|_{\beta=\xi,\,C=G^{-1}(\xi)} + \bigg(\frac{\partial\,\mathrm{vec}\,C}{\partial\beta'}\bigg)'\,\frac{\partial\varphi}{\partial\,\mathrm{vec}\,C}\bigg|_{\beta=\xi,\,C=G^{-1}(\xi)}. $$
We calculate that
$$ \frac{\partial\varphi}{\partial\beta}\bigg|_{\beta=\xi,\,C=G^{-1}} = -\tau\big(q_{\xi,1}^2, \ldots, q_{\xi,N}^2\big)'; $$
$$ \frac{\partial\varphi}{\partial\,\mathrm{vec}\,C}\bigg|_{\beta=\xi,\,C=G^{-1}} = \mathrm{vec}\big\{G + \tau U'U - 2\tau U'\Xi D_qU\big\}; $$
$$ \frac{\partial\,\mathrm{vec}\,C}{\partial\beta'} = -\big(\mathrm{vec}\big(G^{-1}u_1u_1'G^{-1}\big)\ \cdots\ \mathrm{vec}\big(G^{-1}u_Nu_N'G^{-1}\big)\big). $$
Here the Kronecker product is as in Srivastava and Khatri (1979), viz., $A \otimes B = (a_{ij}B)_{i,j}$. Thus
$$ \nabla L_D(\xi) = -\tau\big(q_{\xi,1}^2, \ldots, q_{\xi,N}^2\big)' - \big(\mathrm{vec}(G^{-1}u_1u_1'G^{-1})\ \cdots\ \mathrm{vec}(G^{-1}u_Nu_N'G^{-1})\big)'\,\mathrm{vec}\big\{G + \tau U'U - 2\tau U'\Xi D_qU\big\}. $$
This vector has $i$th element
$$ c_{\xi,i} = -\tau q_{\xi,i}^2 - u_i'G^{-1}\big[G + \tau U'U - 2\tau U'\Xi D_qU\big]G^{-1}u_i, $$
which reduces to (8).

Verification of (11)  Define
$$ \varphi(\beta, C) = \mathrm{tr}\big(XCX'\big) + \tau\sum_{i=1}^N u_i'CX'XCu_i - \tau\sum_{i=1}^N\beta_i\big(u_i'Cu_i\big)\big(u_i'CX'XCu_i\big). $$
Then $L_I(\xi) = \varphi(\xi, G^{-1}(\xi))$, with gradient given as above. We calculate that
$$ \frac{\partial\varphi}{\partial\beta}\bigg|_{\beta=\xi,\,C=G^{-1}} = -\tau\big(q_{\xi,1}r_{\xi,1}, \ldots, q_{\xi,N}r_{\xi,N}\big)'; $$
$$ \frac{\partial\varphi}{\partial\,\mathrm{vec}\,C}\bigg|_{\beta=\xi,\,C=G^{-1}} = \mathrm{vec}\big\{X'X + 2\tau X'XG^{-1}U'U - 2\tau X'XG^{-1}U'\Xi D_qU - \tau U'\Xi D_rU\big\}. $$

Premultiplying by $(\partial\,\mathrm{vec}\,C/\partial\beta')'$ as before, we then calculate that $\nabla L_I(\xi)$ has $i$th element
$$ c_{\xi,i} = -\tau q_{\xi,i}r_{\xi,i} - u_i'G^{-1}\big[X'X + 2\tau X'XG^{-1}U'U - 2\tau X'XG^{-1}U'\Xi D_qU - \tau U'\Xi D_rU\big]G^{-1}u_i, $$
which reduces to (11).

References

Carroll, R.J., Ruppert, D.: Robust estimation in heteroscedastic linear models. Ann. Stat. 10 (1982)
Dette, H., Haines, L.M., Imhof, L.A.: Bayesian and maximin optimal designs for heteroscedastic regression models. Can. J. Stat. 33 (2005)
Fang, Z., Wiens, D.P.: Integer-valued, minimax robust designs for estimation and extrapolation in heteroscedastic, approximately linear models. J. Am. Stat. Assoc. 95 (2000)
Fuller, W.A., Rao, J.N.K.: Estimation for a linear model with unknown diagonal covariance matrix. Ann. Stat. 6 (1978)
Gass, S.I.: Linear Programming: Methods and Applications. McGraw-Hill, New York (1975)
Hooper, P.M.: Iterative weighted least squares estimation in heteroscedastic linear models. J. Am. Stat. Assoc. 88 (1993)
Pukelsheim, F.: Optimal Design of Experiments. Wiley, New York (1993)
Shao, J.: Empirical Bayes estimation of heteroscedastic variances. J. Am. Stat. Assoc. 88 (1993)
Srivastava, M.S., Khatri, C.G.: An Introduction to Multivariate Statistics. North-Holland, New York (1979)
Wiens, D.P.: Minimax robust designs and weights for approximately specified regression models with heteroscedastic errors. Stat. Sin. 8 (1998)
Wiens, D.P.: Robust weights and designs for biased regression models: least squares and generalized M-estimation. J. Stat. Plan. Inference 83 (2000)
