Optimal experimental design, an introduction, Jesús López Fidalgo

1 Optimal experimental design, an introduction. University of Castilla-La Mancha, Department of Mathematics, Institute of Mathematics Applied to Science and Engineering.

2 Books (just a few)
Fedorov V.V. (1972). Theory of Optimal Experiments. Academic Press. New York.
Pázman A. (1986). Foundations of Optimum Experimental Design. D. Reidel Pub. New York.
Pukelsheim F. (1993). Optimal Design of Experiments. John Wiley & Sons. New York.
Fedorov V.V. and Hackl P. (1997). Model-Oriented Design of Experiments. Springer. New York.
Atkinson A.C., Donev A.N., Tobias R. (2007). Optimum Experimental Designs, with SAS. Oxford Science Publications. Oxford.

3 OPTIMAL EXPERIMENTAL DESIGN THEORY

4 Concepts and notation. Process: select a model, E(y|x) = f^T(x)θ, Var(y|x) = σ² (linear model), with f continuous and its components linearly independent. Select an experimental condition x in a design space χ (typically a compact subset of a Euclidean space). Examples:
y = α_0 + α_1 x + ε, f(x) = (1, x)^T, x ∈ χ = [a, b].
y = α_1 x_1 + α_2 x_2 + α_3 x_3 + ε, f(x) = x = (x_1, x_2, x_3)^T, x ∈ χ = [a_1, b_1] × [a_2, b_2] × [a_3, b_3].
y = α_0 + α_1 x + α_2 x² + ε, f(x) = (1, x, x²)^T, x ∈ χ = [a, b].
ANOVA (1-way, 3 levels): y = β_0 + β_1 x_1 + β_2 x_2 + ε, f(x_1, x_2) = (1, x_1, x_2)^T, x = (x_1, x_2) ∈ χ = {(0, 1), (1, 0), (1, 1)}.
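
These regression vectors are easy to encode directly; a minimal sketch in Python/numpy (the function names are mine, not from the slides):

```python
import numpy as np

# Regression vectors f(x) for the example models above (names are illustrative).
f_simple    = lambda x: np.array([1.0, x])            # y = a0 + a1*x
f_multiple  = lambda x: np.asarray(x, dtype=float)    # y = a1*x1 + a2*x2 + a3*x3
f_quadratic = lambda x: np.array([1.0, x, x**2])      # y = a0 + a1*x + a2*x^2
f_anova     = lambda x: np.array([1.0, x[0], x[1]])   # one-way ANOVA coding of the slide

print(f_quadratic(0.5))   # [1.   0.5  0.25]
```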

5 Process. Perform the experiment and observe the response (r.v. y). Repeat the process for the experimental conditions x_1, ..., x_n, obtaining responses y_1, ..., y_n. Estimate the parameters (MLE) θ_1, ..., θ_m, σ² and make inferences.

6 EXPERIMENTAL DESIGN. Aim: choosing x_1, ..., x_n (exact design of size n). Some may be repeated (probability measure): ξ_n(x) = n_x/n, where n_x = number of times x appears in the design. Covariance matrix: Σ_θ̂ = σ²(X^T X)^{-1} = σ² n^{-1} M^{-1}(ξ), where M(ξ) = Σ_{x∈χ} f(x)f^T(x)ξ_n(x) is the information matrix: symmetric, non-negative definite, and singular if the number k of different points in the design satisfies k < m.

7 Extended concept of experimental design. Approximate design (Kiefer): a probability measure ζ on the Borel field generated by the open sets of χ. Information matrix: M(ζ) = ∫_χ f(x)f^T(x)ζ(dx). Convex set of approximate designs: Ξ. If ζ is discrete with associated probability function ξ: M(ξ) = Σ_{x∈χ} f(x)f^T(x)ξ(x). If ζ is absolutely continuous with associated density function (pdf) ξ: M(ξ) = ∫_χ f(x)f^T(x)ξ(x)dx. The set M of information matrices is compact and convex.

8 Carathéodory's theorem. For any information matrix there is always a design with at most m(m + 1)/2 + 1 different points. Finite support: ξ = {x_1, x_2, ..., x_k; p_1, p_2, ..., p_k}. Restricting Ξ to finite designs, M is still the same (Carathéodory, definition of the integral and continuity). In practice, perform n_i ≈ nξ(x_i) experiments at x_i, with Σ_i n_i = n. From now on we will use ξ or ζ as convenient.

9 Example. y = α_0 + α_1 x + ε, x ∈ χ = [0, 1], f(x) = (1, x)^T. ξ = {0, 0.5, 1; 0.2, 0.4, 0.4}.
M(ξ) = Σ_{i=1}^{3} p_i [1, x_i; x_i, x_i²] = 0.2 [1, 0; 0, 0] + 0.4 [1, 0.5; 0.5, 0.25] + 0.4 [1, 1; 1, 1] = [1, 0.6; 0.6, 0.5]
(matrices written row by row, rows separated by semicolons).
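
A small numerical sketch (numpy; variable names are mine) reproducing this information matrix and the corresponding covariance matrix of the estimators:

```python
import numpy as np

f = lambda x: np.array([1.0, x])          # f(x) = (1, x)^T for the simple linear model

points  = [0.0, 0.5, 1.0]                 # support of the design xi
weights = [0.2, 0.4, 0.4]                 # design weights

# M(xi) = sum_i p_i f(x_i) f(x_i)^T
M = sum(p * np.outer(f(x), f(x)) for x, p in zip(points, weights))
print(M)                                  # [[1.  0.6]
                                          #  [0.6 0.5]]

# With n observations and variance sigma^2: Cov(theta_hat) = sigma^2 n^{-1} M^{-1}(xi)
sigma2, n = 1.0, 12
print(sigma2 / n * np.linalg.inv(M))
```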

10 Example (rounding off). ξ = {0, 0.5, 1; 0.2, 0.4, 0.4}. If just n = 12 experiments may be performed, possible roundings: 12 × 0.2 = 2.4, so 2 or 3 experiments at 0; 12 × 0.4 = 4.8, so 4 or 5 experiments at 0.5; 12 × 0.4 = 4.8, so 4 or 5 experiments at 1; e.g. the apportionments (2, 5, 5), (3, 4, 5) and (3, 5, 4). Remark: it is not as simple as the usual rounding off.
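
One way to see the remark: compare the candidate apportionments of the 12 runs through the determinant of the resulting normalized information matrices (the criterion behind D-optimality, introduced below). A sketch with my own helper names:

```python
import numpy as np

f = lambda x: np.array([1.0, x])
points = [0.0, 0.5, 1.0]

def det_info(counts):
    """Determinant of the normalized information matrix of an exact design."""
    n = sum(counts)
    M = sum(c / n * np.outer(f(x), f(x)) for x, c in zip(points, counts))
    return np.linalg.det(M)

# Candidate roundings of n*xi = (2.4, 4.8, 4.8) to 12 runs
for counts in [(2, 5, 5), (3, 4, 5), (3, 5, 4)]:
    print(counts, round(det_info(counts), 4))   # the three allocations are not equivalent
```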

11 CRITERION FUNCTION. Φ: Ξ → [0, +∞) or Φ: M → [0, +∞), to be minimized (maximized). The first is more general. The second has to be considered as a function of M^{-1}; it has to be non-increasing (better estimates of the parameters): M ≤ N ⟹ Φ(M) ≥ Φ(N); it may be global or partial (interest in part of the parameters); it may be positively homogeneous: Φ(δM) = (1/δ) Φ(M), δ > 0. M lies within the cone of the non-negative definite matrices in the Euclidean space of the symmetric matrices; thus, the usual concept of differentiability applies here. A Φ-optimal design ζ* minimizes Φ. Convexity: Φ[(1 − ε)ζ + εζ'] ≤ (1 − ε)Φ(ζ) + εΦ(ζ').

12 Equivalence theorem for differentiable criteria. Sensitivity function: ψ(x, ξ) = ∂Φ[M(ξ), f(x)f^T(x)] = f^T(x) ∇Φ[M(ξ)] f(x) − tr{M(ξ) ∇Φ[M(ξ)]}. Equivalence theorem: ξ* is Φ-optimal if and only if ψ(x, ξ*) ≥ 0 for all x ∈ χ, with equality for x ∈ S_{ξ*} (the support of ξ*).

13 Comments on the equivalence theorem Just for approximate designs. M is convex and it is the same for finite or general designs. From now on we assume finite designs and use the symbol ξ (pdf) instead of ζ (measure). Still valid for a restricted search in a convex subset.

14 Checking the condition. ψ(z, ξ*) = 0 for z ∈ S_{ξ*}, and (∂ψ(x, ξ*)/∂x)|_{x=z} = 0 for z ∈ S_{ξ*} ∩ Int(χ). E.g., for a two-parameter model assume ξ = {x_1, x_2; 1 − p, p}; then there are two equations, ψ(x_1, ξ) = 0 and ψ(x_2, ξ) = 0, plus 0, 1 or 2 of the equations (∂ψ(x, ξ)/∂x)|_{x=x_1} = 0, (∂ψ(x, ξ)/∂x)|_{x=x_2} = 0 to be solved, depending on how many support points are interior to χ.

15 EFFICIENCY (goodness of a design). eff_Φ(ξ) = Φ[M(ξ*)] / Φ[M(ξ)]. If Φ is homogeneous, e.g. for 60% efficiency and n observations (writing Σ_θ̂ and Σ*_θ̂ for the covariance matrices of θ̂ under ξ and ξ*):
0.6 = Φ[M(ξ*)] / Φ[M(ξ)] = Φ(σ² n^{-1} (Σ*_θ̂)^{-1}) / Φ(σ² n^{-1} (Σ_θ̂)^{-1}) = [n σ^{-2} Φ((Σ*_θ̂)^{-1})] / [n σ^{-2} Φ((Σ_θ̂)^{-1})] = Φ((Σ*_θ̂)^{-1}) / Φ((Σ_θ̂)^{-1}).
For n observations with ξ and n' observations with ξ*, the two designs are equally good (relative efficiency 1) if
1 = Φ((Σ*_θ̂)^{-1}) / Φ((Σ_θ̂)^{-1}) = [σ² n'^{-1} Φ[M(ξ*)]] / [σ² n^{-1} Φ[M(ξ)]] = (n / n') 0.6,
i.e. n' = 0.6n: 60% of the observations would be enough with ξ*.

16 General properties for global criteria. A Φ-optimal design has at least m points: any design with fewer than m points gives a singular matrix. For strictly convex criteria there is always a Φ-optimal design with no more than m(m + 1)/2 points: the optimum must be on the boundary (of dimension m(m + 1)/2 − 1), and Carathéodory on the boundary gives a limit of m(m + 1)/2 points. If σ²(x) is not constant, the whole theory is applicable with f(x) replaced by f(x)/σ(x). If the observations are correlated (Σ_y), only exact designs apply; information matrix: X^T Σ_y^{-1} X.

17 D-OPTIMALITY. Determinant of the inverse:
Φ_D[M(ξ)] = log det M^{-1}(ξ) = −log det M(ξ) if det M(ξ) ≠ 0; +∞ if det M(ξ) = 0.
Homogeneous definition:
Φ_D[M(ξ)] = [det M(ξ)]^{-1/m} if det M(ξ) ≠ 0; +∞ if det M(ξ) = 0.
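
As a sketch of how the homogeneous D-criterion is used, the code below computes the D-efficiency of the three-point design from the earlier example relative to the design with equal weights at {0, 1}, which is the D-optimal design for the simple linear model on [0, 1] (a standard result). Function names are mine:

```python
import numpy as np

f = lambda x: np.array([1.0, x])
m = 2                                            # number of parameters

def M(points, weights):
    return sum(p * np.outer(f(x), f(x)) for x, p in zip(points, weights))

def phi_D(mat):
    """Homogeneous D-criterion: (det M)^(-1/m), +inf for a singular matrix."""
    d = np.linalg.det(mat)
    return d ** (-1.0 / m) if d > 1e-12 else np.inf

M_xi   = M([0.0, 0.5, 1.0], [0.2, 0.4, 0.4])     # design of the earlier example
M_star = M([0.0, 1.0], [0.5, 0.5])               # D-optimal design on [0, 1]

eff = phi_D(M_star) / phi_D(M_xi)                # = (det M_xi / det M_star)^(1/m)
print(round(eff, 3))                             # about 0.748
```

Read through the efficiency slide above, an efficiency of about 0.75 means that roughly 75% of the observations would suffice under the optimal design to match the precision of ξ.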

18 Properties. Φ_D is continuous on M. Convex on M and strictly convex on M⁺ (nonsingular information matrices). Differentiable on M⁺: ∇(log det M) = M^{-1}. D-optimality minimizes the volume of the confidence ellipsoid (θ̂ − θ)^T (X^T X)(θ̂ − θ) ≤ m F_{m,n−m,γ} s² = c², whose volume is c^m V_m [det M^{-1}(ξ)]^{1/2}, where V_m is the volume of the m-dimensional unit sphere.

19 G-optimality. Kirstine Smith (1918); Kiefer and Wolfowitz gave the name. Φ_G[M(ξ)] = max_{x∈χ} f^T(x) M^{-1}(ξ) f(x) if det M(ξ) ≠ 0; +∞ if det M(ξ) = 0. Φ_G is continuous and convex on M. Φ_G[M(ξ)] ∝ max_{x∈χ} var_ξ ŷ(x).

20 A-optimality. Trace of the inverse: Φ_A[M(ξ)] = tr M^{-1}(ξ) ∝ Σ_{i=1}^{m} var_ξ θ̂_i if det M(ξ) ≠ 0; +∞ if det M(ξ) = 0. The second relation is a consequence of var_ξ θ̂_i = σ² n^{-1} e_i^T M^{-1}(ξ) e_i if the parameters are estimable. Φ_A is continuous on M, convex on M and strictly convex on M⁺. Differentiable on M⁺: ∇(tr M^{-1}(ξ)) = −M^{-2}(ξ).

21 L-optimality. Atwood (1976): Φ_L[M(ξ)] = tr W M^{-1}(ξ) if det M(ξ) ≠ 0; +∞ if det M(ξ) = 0, where W is a positive definite matrix of dimension m. Φ_L is continuous on M, convex on M and strictly convex on M⁺. Differentiable on M⁺: ∇[tr W M^{-1}(ξ)] = −M^{-1}(ξ) W M^{-1}(ξ).

22 c-optimality. Variance of the estimator of c^T θ, a linear combination of the parameters: Φ_c[M(ξ)] = c^T M^{−}(ξ) c, where M^{−} is a generalized inverse.
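
For a nonsingular information matrix all of these criteria are simple functionals of M^{-1}(ξ); a tiny evaluation for the example design used above (W and c are arbitrary choices for illustration):

```python
import numpy as np

f = lambda x: np.array([1.0, x])
pts, wts = [0.0, 0.5, 1.0], [0.2, 0.4, 0.4]
Minv = np.linalg.inv(sum(p * np.outer(f(x), f(x)) for x, p in zip(pts, wts)))

W = np.diag([1.0, 2.0])         # some positive definite weight matrix
c = np.array([0.0, 1.0])        # interest in the slope only

print(np.trace(Minv))           # A-criterion: tr M^{-1}      ~ 10.71
print(np.trace(W @ Minv))       # L-criterion: tr W M^{-1}    ~ 17.86
print(c @ Minv @ c)             # c-criterion: c^T M^{-1} c   ~  7.14
```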

23 Elfving procedure in practice. Plot the curve x(t) = f_1(t), y(t) = f_2(t), t ∈ χ, and its symmetric image through the origin. Plot the convex hull of both curves. Plot the line through the origin defined by the vector c. The optimal design is determined by the point where this line cuts the boundary of the convex hull.

24 Elfving plot (one-point design): ξ = {t; 1}.

25 Elfving plot (two-point design): ξ = {a, t; 1 − p, p}.

26 Elfving procedure for three parameters. y = θ_0 + θ_1 t + θ_2 t² + ε, t ∈ [0, 1]. One, two or three design points.

27 Equivalence theorem for D- and G-optimality. ξ* is D-optimal iff it is G-optimal: max_{x∈χ} d(x, ξ*) = m, where d(x, ξ) = f^T(x) M^{-1}(ξ) f(x). Efficiency bound: eff_D[M(ξ)] ≥ 2 − max_{x∈χ} d(x, ξ)/m.
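
A quick numerical illustration of the theorem for the simple linear model on [0, 1] (using again the standard fact that equal weights at {0, 1} are D-optimal there); the grid check below, with my own variable names, confirms max_x d(x, ξ*) = m = 2, attained at the support points:

```python
import numpy as np

f = lambda x: np.array([1.0, x])
M = 0.5 * np.outer(f(0.0), f(0.0)) + 0.5 * np.outer(f(1.0), f(1.0))
Minv = np.linalg.inv(M)

xs = np.linspace(0.0, 1.0, 1001)
d = np.array([f(x) @ Minv @ f(x) for x in xs])   # d(x, xi*) = f^T(x) M^{-1}(xi*) f(x)

print(round(d.max(), 6))                         # 2.0 = m, reached at x = 0 and x = 1
```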

28 Example: checking the condition for a quadratic model. E[y] = θ^T f(x) = θ_1 + θ_2 x + θ_3 x², σ²(x) = 1, x ∈ χ = [−1, 2], f(x) = (1, x, x²)^T.
M(ξ) = Σ_{x∈χ} f(x) f^T(x) ξ(x) = Σ_{x∈χ} ξ(x) [1, x, x²; x, x², x³; x², x³, x⁴].
Assume the D-optimal design is of the type ξ = {−1, z, 2; 1/3, 1/3, 1/3}.

29 Quadratic model. Maximize the determinant: det M(ξ) = (1/3)(−z² + z + 2)². The maximum is reached at z = 0.5: ξ* = {−1, 0.5, 2; 1/3, 1/3, 1/3}. Checking whether ξ* is D-optimal: d(x, ξ*) = f^T(x) M^{-1}(ξ*) f(x) = 3 + 0.89 (x² − x − 2)(x² − x + 0.25) ≤ 3 on χ, with equality at −1, 0.5 and 2.
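
A numerical counterpart of this check (a sketch, numpy only, names are mine): maximize det M(ξ) over the middle support point z on a grid and verify the equivalence-theorem bound d(x, ξ*) ≤ 3 on χ = [−1, 2]:

```python
import numpy as np

f = lambda x: np.array([1.0, x, x**2])

def M(points, weights):
    return sum(p * np.outer(f(x), f(x)) for x, p in zip(points, weights))

# Maximize det M over the middle point z of the design {-1, z, 2; 1/3, 1/3, 1/3}
zs = np.linspace(-1.0, 2.0, 3001)
dets = [np.linalg.det(M([-1.0, z, 2.0], [1/3, 1/3, 1/3])) for z in zs]
z_star = zs[int(np.argmax(dets))]
print(round(z_star, 3))                          # 0.5

# Equivalence theorem: d(x, xi*) <= m = 3 on [-1, 2], with equality on the support
Minv = np.linalg.inv(M([-1.0, z_star, 2.0], [1/3, 1/3, 1/3]))
xs = np.linspace(-1.0, 2.0, 3001)
d = np.array([f(x) @ Minv @ f(x) for x in xs])
print(round(d.max(), 6))                         # 3.0, attained at x = -1, 0.5, 2
```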

30 Example: Elfving procedure. E[y] = θ^T f(x) = θ_1 x + θ_2 x², σ²(x) = 1, x ∈ χ = [0, 1], f(x) = (x, x²)^T. Plot the curve x(t) = t, y(t) = t², t ∈ [0, 1].

31 Symmetric through the origin

32 Convex hull. Tangential point z, with (x_0, y_0) = (−1, −1): (x(z) − x_0)/x'(z) = (y(z) − y_0)/y'(z), i.e. (z + 1)/1 = (z² + 1)/(2z), giving z = √2 − 1 ≈ 0.41.

33 Line through the origin defined by the vector c = (0.5, 0.3). It cuts the curve where (0.5λ)² = 0.3λ, i.e. at t = 3/5, so ξ*_c = {3/5; 1}.

34 Line through the origin defined by the vector c = (0.5, 1). Cutting point with the boundary: y = 2x and (x − 1)/(z + 1) = (y − 1)/(z² + 1), giving x = (2 − √2)/4 ≈ 0.15, y = 2x ≈ 0.3.

35 Weights. Convex combination: (0.15, 0.3) = (1 − p)(−z, −z²) + p(1, 1), giving p = (3 − √2)/4 ≈ 0.4, so ξ*_c = {√2 − 1, 1; 1 − p, p} ≈ {0.41, 1; 0.6, 0.4}.
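
The geometric answer can be cross-checked numerically with the equivalence theorem for c-optimality (for nonsingular M(ξ*): ξ* is c-optimal iff (f^T(x) M^{-1}(ξ*) c)² ≤ c^T M^{-1}(ξ*) c for all x ∈ χ, with equality on the support). A sketch under the values reconstructed above:

```python
import numpy as np

f = lambda x: np.array([x, x**2])           # f(x) = (x, x^2)^T for this model
c = np.array([0.5, 1.0])

z = np.sqrt(2) - 1                           # tangential point from the Elfving plot
p = (3 - np.sqrt(2)) / 4                     # weight at x = 1
M = (1 - p) * np.outer(f(z), f(z)) + p * np.outer(f(1.0), f(1.0))
Minv = np.linalg.inv(M)

# (f(x)^T M^{-1} c)^2 <= c^T M^{-1} c on [0, 1], with equality at x = z and x = 1
xs = np.linspace(0.0, 1.0, 1001)
lhs = np.array([(f(x) @ Minv @ c) ** 2 for x in xs])
rhs = c @ Minv @ c
print(round(lhs.max(), 4), round(rhs, 4))    # both are about 11.66, the optimal Phi_c value
```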

36 THANK YOU
