Optimal Design for Nonlinear and Spatial Models: Introduction and Historical Overview
Chapter 12
Douglas P. Wiens
Dean/Handbook of Design and Analysis of Experiments K14518_C012 Revises

CONTENTS
12.1 Introduction
12.2 Generalized Linear Models
12.3 Selected Nonlinear Models
12.4 Spatial Models
References

12.1 Introduction

The topic of this part of the handbook, optimal design for nonlinear and spatial models, allows for a very broad range of subtopics. We should first distinguish these from those formulated for linear models. A salient feature of design problems for linear models is that the common functions expressing the experimenter's loss, when estimating the mean response, do not depend on the unknown parameters being estimated. In this chapter, a number of design problems are introduced in which this very convenient feature is absent, and ways of dealing with its absence are discussed in general terms. Thus, although we treat classical nonlinear regression models, in which a response variable $y$ is measured with additive error and $E[y \mid x]$ is a nonlinear function of parameters $\theta$ to be estimated after the experiment is conducted, there is a multitude of other applications.

In this chapter, these subjects will be introduced in broad generality only, and some historical context provided; precise details and examples are given in the three chapters which follow:
- Designs for Generalized Linear Models (Chapter 13)
- Designs for Selected Nonlinear Models (Chapter 14)
- Optimal Design for Spatial Models (Chapter 15)

Chapters 22, 24 and 25 deal with special applications that use nonlinear models.

12.2 Generalized Linear Models

For a book-length treatment of generalized linear models (GLMs), we refer the reader to the now classic text of McCullagh and Nelder (1989). Briefly, the response variable $y$, given a
covariate vector $x$ chosen by the experimenter, follows a distribution from the exponential family, with (canonical) density
$$p(y \mid \theta, \phi, x) = \exp\left\{ \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) \right\},$$
for scalar functions $a(\cdot)$, $b(\cdot)$ and $c(\cdot, \cdot)$. The canonical parameter $\theta$ relates the systematic linear component $\eta(x) = f^{\top}(x)\beta$, with regressors $f(x)$ and regression parameters $\beta$, to the mean $\mu = db(\theta)/d\theta$ via an invertible link function $g$, namely, $\eta = g(\mu)$. We write $h = g^{-1}$; $h^{(1)}$ and $h^{(2)}$ are the first and second derivatives of $h$ with respect to $\eta$.

The parameters are typically estimated by maximum likelihood, computed from observations $\{y_i\}_{i=1}^{n}$ made at points $\{x_i\}_{i=1}^{n}$ chosen from a design space $\chi$. The asymptotic variance of $\hat\beta$ is the inverse $I^{-1}(\beta)$ of the information matrix
$$I(\beta) = X^{\top} U X,$$
where $X$ is the model matrix, with $i$th row $f^{\top}(x_i)$ ($i = 1, \ldots, n$), and $U$ is the diagonal matrix of weights, with $i$th diagonal element $\left(h^{(1)}(\eta(x_i))\right)^2 / \mathrm{Var}[y \mid x_i]$. If the designer is primarily interested in precise estimation of $\beta$, then he or she will aim to maximize, in some sense, $I(\beta)$; this leads to the adoption of classical alphabetic optimality criteria, notably D-optimality, in which the goal is maximization of $\det(I(\beta))$.

The mean is estimated by $\hat\mu(x) = h\left(f^{\top}(x)\hat\beta\right)$, with asymptotic variance and asymptotic bias given by (Robinson and Khuri 2003)
$$\mathrm{Var}[\hat\mu(x)] = \left(h^{(1)}(\eta(x))\right)^2 f^{\top}(x) I^{-1}(\beta) f(x),$$
$$\mathrm{Bias}[\hat\mu(x)] = h^{(1)}(\eta(x))\, f^{\top}(x) I^{-1}(\beta) X^{\top} U \psi + \frac{1}{2} h^{(2)}(\eta(x))\, f^{\top}(x) I^{-1}(\beta) f(x),$$
where $\psi_{n \times 1}$ has elements
$$\psi_i = -\frac{h^{(2)}(\eta(x_i))}{2 h^{(1)}(\eta(x_i))}\, f^{\top}(x_i) I^{-1}(\beta) f(x_i).$$
If interest focusses on prediction of mean values, then the designer will aim to minimize some function of the mean squared errors (MSEs)
$$\mathrm{MSE}[\hat\mu(x)] = \mathrm{Var}[\hat\mu(x)] + \mathrm{Bias}^2[\hat\mu(x)];$$
an obvious choice is the integral or average of $\mathrm{MSE}[\hat\mu(x)]$ over the design space $\chi$.

The class of GLMs spawns a wealth of particular applications and related design issues.
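As a concrete illustration of the information matrix $I(\beta) = X^{\top}UX$ and the D-criterion $\det(I(\beta))$, the following minimal sketch evaluates both for a binary response with logit link and regressors $f(x) = (1, x)^{\top}$. The function names, the choice of link and the candidate designs are assumptions made for illustration only; they are not taken from the chapter.

```python
import math


def logistic_weight(eta):
    """GLM weight (h'(eta))^2 / Var[y | x] for the logit link.

    For h(eta) = 1 / (1 + exp(-eta)) both h'(eta) and Var[y | x] equal
    pi * (1 - pi), so the weight reduces to pi * (1 - pi).
    """
    pi = 1.0 / (1.0 + math.exp(-eta))
    return pi * (1.0 - pi)


def information_matrix(xs, beta):
    """I(beta) = X'UX for f(x) = (1, x), with one run per entry of xs."""
    b0, b1 = beta
    m00 = m01 = m11 = 0.0
    for x in xs:
        u = logistic_weight(b0 + b1 * x)
        m00 += u
        m01 += u * x
        m11 += u * x * x
    return [[m00, m01], [m01, m11]]


def d_criterion(xs, beta):
    """det(I(beta)); larger values mean a smaller asymptotic confidence region."""
    m = information_matrix(xs, beta)
    return m[0][0] * m[1][1] - m[0][1] ** 2


if __name__ == "__main__":
    assumed_beta = (0.0, 1.0)  # hypothetical parameter values
    for design in ([-1.0, 1.0], [-1.5434, 1.5434], [-3.0, 3.0]):
        print(design, d_criterion(design, assumed_beta))
```

Because the weights depend on $\beta$, the ranking of the candidate designs can change when `assumed_beta` changes, which is the parameter dependence that distinguishes these problems from linear ones.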
Prominent among these is logistic regression, in which a binary response $y$ has
$$P(y = 1) \overset{\mathrm{def}}{=} \pi = L(\alpha + \beta x),$$
for $L(\eta) = 1/(1 + e^{-\eta})$, the logistic distribution function. Here $\mu = \pi$ and $\eta = g(\mu) = \ln(\pi/(1 - \pi))$, the logit. One might seek a design (a choice of values of $x$ and a specification of the frequencies with which $y$ is to be observed at these values) in order to estimate the linear parameters efficiently, or to study functions of these parameters. For instance, in bioassay and dose-response problems, interest often focusses on the covariate value
$$x_{\pi_0} = \frac{L^{-1}(\pi_0) - \alpha}{\beta}$$
required to attain a response $y = 1$ in a specified proportion $\pi_0$ of the population. The role of the logistic distribution here may of course be played by other distributions; if $L$ is replaced by the Gaussian distribution function, then one is dealing with probit regression, and similar design problems are of interest.

One of the earliest instances of nonlinear regression design is for the exponential regression GLM: Fisher (1922) considered a dilution-series problem, with $P(y = 1) = \exp(-\theta x)$ for $\theta, x > 0$. This problem is also the subject of an example by Fedorov (1972), who notes that the information matrix for $\theta$ is a scalar, maximized by placing all observations at the solution $x$ ($\approx 1.6/\theta$) of the equation $2e^{-\theta x} + \theta x = 2$.

In the Poisson count model, $y$ follows a Poisson distribution with mean $\mu = f^{\top}(x)\beta$, and the experimenter is typically interested in efficiently estimating functions of $\beta$. The optimal design will of course depend on which such function is of interest. Particular examples are discussed, for response surface exploration in an environmetric setting, by Myers (1999).

In these problems, and indeed in virtually all design problems for GLMs, one begins by determining an optimal design under the assumption that certain parameters, even those to be estimated from the experimental data, are known beforehand.
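The dilution-series result quoted above can be checked numerically. For a single binary observation with $P(y = 1) = e^{-\theta x}$, the Fisher information for $\theta$ is $(\partial\pi/\partial\theta)^2 / \{\pi(1 - \pi)\} = x^2/(e^{\theta x} - 1)$, and setting its derivative in $x$ to zero gives $2e^{-\theta x} + \theta x = 2$. The sketch below solves that equation by bisection; the function names and the bracketing interval $(1, 2)$ are choices made here for illustration.

```python
import math


def information(x, theta):
    """Fisher information for theta from one observation at dilution x,
    when P(y = 1) = exp(-theta * x): x^2 / (exp(theta * x) - 1)."""
    return x * x / math.expm1(theta * x)


def optimal_scaled_point(tol=1e-12):
    """Root t = theta * x of 2 exp(-t) + t = 2 in (1, 2), found by bisection."""
    def g(t):
        return 2.0 * math.exp(-t) + t - 2.0

    lo, hi = 1.0, 2.0  # g(1) < 0 < g(2), so the root is bracketed
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    theta = 2.5  # hypothetical value of the unknown parameter
    t_star = optimal_scaled_point()
    x_star = t_star / theta
    print("theta * x at the optimum:", t_star)
    print("information at x*, 0.8 x*, 1.2 x*:",
          information(x_star, theta),
          information(0.8 * x_star, theta),
          information(1.2 * x_star, theta))
```

The optimal point is $t^*/\theta$, so it cannot be computed without a value for $\theta$: the design depends on the very parameter the experiment is meant to estimate.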
This clearly untenable assumption might then be dropped in a number of ways, all discussed in detail in the chapters which follow. One can content oneself with a locally optimal design, in which optimality is sought only at, or in a small neighbourhood of, these assumed parameter values. Alternatively, one might design so as to minimize the maximum loss, with the maximum evaluated over a set of parameters; this is a mild robustness criterion which is also discussed in Chapter 20. Another approach is to choose the design points sequentially, at each stage using parameter estimates derived from the preceding observations. A further possibility, when the loss function being minimized depends on unknown parameters, is to integrate them out with respect to a prior, and then to minimize the average loss so obtained. This pseudo-Bayesian criterion is discussed in Chapter 13 and is a topic to which we return in Section 12.3.

The field of optimal design for GLMs seems to have blossomed in the 1980s, and many contributors acknowledge a debt to Ford et al. (1989), who surveyed the then current state of research in a more general context of nonlinear design. Burridge and Sebastiani (1992) obtained locally D-optimal designs, that is, designs maximizing $\det(I(\beta))$ for fixed values of $\beta$. For this, they pointed out that if the parameters are known, then the problem can be transformed to the D-optimality problem for a linear model with model matrix $U^{1/2}X$; they applied methods developed for linear design theory to derive optimal designs in this transformed problem and then translated these back to the original problem. In a small simulation study, with a bivariate linear predictor $\eta$ and canonical link $\eta = \mu^{1/k}$ for various values of $k$, the efficiencies turned out to be relatively insensitive to the settings of the parameter values.
Ford et al. (1992) refer to this transformed problem, in terms of $U^{1/2}X$, as the canonical form of the problem. They consider the structure of the induced design space in some depth and use methods of Elfving (1952) to obtain locally D-optimal and c-optimal designs; the latter are designs minimizing $c^{\top} I^{-1}(\beta) c$ for fixed $c$. As do Burridge and Sebastiani (1992), they concentrate on examples with two linear parameters $(\beta_0, \beta_1)$; in both of these papers, the optimal designs turn out to be concentrated on one, two or three points. Atkinson and Haines (1996) apply this canonical approach to, among others, examples of multifactor experiments.

A class of attractive alternatives to local optimality is given by sequential designs. The related asymptotic theory is best developed for the case of D-optimality. Here it is supposed that one will obtain $n_1$ observations from an initial, static design. These are used to give initial estimates of the parameters, following which the remaining $n - n_1$ observations are made sequentially, at each stage choosing the next design point so as to maximize the determinant of the information matrix evaluated at the current parameter estimates. Chaudhuri and Mykland (1993) show that, under certain conditions, the sequence of designs so obtained converges to the D-optimal design for the true parameters. These conditions include the requirement that $n_1/n \to 0$ as both $n$ and $n_1$ tend to infinity and an assumption that the parameter estimates be consistent. A consequence is that inferences made from a sequentially constructed design have the same asymptotic properties as if they were made following a static design, an observation previously made by Wu (1985) in a related context. Sinha and Wiens (2002) extend the ideas of Chaudhuri and Mykland, and incorporate some uncertainty as to the nature of the parametric model.
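The sequential scheme just described (an initial static design, then one point at a time chosen to maximize the determinant of the information matrix at the current estimates) can be sketched as follows for a two-parameter logistic model. Everything specific here is an assumption made for illustration: the candidate grid, the simulated responses, and in particular the lightly ridge-penalized Newton fit with capped steps, which merely stands in for maximum likelihood so that the early estimates stay finite. It is not the procedure of any of the papers cited.

```python
import math
import random


def prob(eta):
    """Logistic response probability, with eta clipped to avoid overflow."""
    eta = max(-30.0, min(30.0, eta))
    return 1.0 / (1.0 + math.exp(-eta))


def det_info(xs, beta):
    """det(X'UX) for f(x) = (1, x) and logistic weights u = pi * (1 - pi)."""
    m00 = m01 = m11 = 0.0
    for x in xs:
        p = prob(beta[0] + beta[1] * x)
        u = p * (1.0 - p)
        m00 += u
        m01 += u * x
        m11 += u * x * x
    return m00 * m11 - m01 * m01


def next_point(xs, beta, candidates):
    """Candidate whose addition maximizes det I at the current estimate."""
    return max(candidates, key=lambda c: det_info(xs + [c], beta))


def fit(xs, ys, ridge=0.01, steps=25):
    """Newton steps for a ridge-penalized logistic log-likelihood (assumption)."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0, g1 = -ridge * b0, -ridge * b1      # penalized score
        h00, h01, h11 = ridge, 0.0, ridge      # penalized information
        for x, y in zip(xs, ys):
            p = prob(b0 + b1 * x)
            u = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += u
            h01 += u * x
            h11 += u * x * x
        det = h00 * h11 - h01 * h01
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        scale = min(1.0, 2.0 / max(abs(d0), abs(d1), 1e-12))  # cap the step
        b0 += scale * d0
        b1 += scale * d1
    return b0, b1


def sequential_design(true_beta, start, n_total, candidates, seed=0):
    """Grow the design one run at a time; responses are simulated."""
    rng = random.Random(seed)

    def draw(x):
        return 1 if rng.random() < prob(true_beta[0] + true_beta[1] * x) else 0

    xs = list(start)
    ys = [draw(x) for x in xs]
    while len(xs) < n_total:
        beta_hat = fit(xs, ys)
        x_new = next_point(xs, beta_hat, candidates)
        xs.append(x_new)
        ys.append(draw(x_new))
    return xs, fit(xs, ys)


if __name__ == "__main__":
    grid = [i / 4.0 for i in range(-16, 17)]
    design, estimate = sequential_design((0.0, 1.0), [-2.0, -1.0, 1.0, 2.0], 40, grid)
    print(sorted(set(design)), estimate)
```

In this sketch the initial design has $n_1 = 4$ runs out of $n = 40$; the asymptotic results cited above concern the regime in which $n_1/n \to 0$.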
Dror and Steinberg (2008) introduce significant improvements to these methods; in particular, their sequential procedure for design construction is easily adapted to multifactor experiments and to a range of possible models.

One likely reason for the popularity of the D-optimality criterion in these problems is its invariance under non-singular transformations of the design space, leading to the possibility of transforming to the aforementioned canonical form of the problem. Failing this, other methods are available. Yang (2008) takes a direct algebraic approach to obtain A-optimal designs (minimizing $\mathrm{trace}(I^{-1}(\beta))$) for logistic, probit and Laplace models with two linear parameters. Other criteria (minimizing the integrated $\mathrm{MSE}[\hat\mu(x)]$, for instance) rely more heavily on numerical methods of design construction. One sequential approach of considerable interest involves stochastic approximation; see the discussion in Khuri et al. (2006) and, in a dose-finding framework, Cheung (2010). Once a design is constructed, by this or another method, it is of obvious interest to compare its performance with other candidate designs; the quantile dispersion graphs of Robinson and Khuri (2003) provide a possible means for doing this.

Here, but a few of the many facets of design for GLMs have been touched upon; these, and the broad spectrum of topics discussed in Chapter 13, illustrate that design theory for GLMs continues to be an active and exciting area of research.

12.3 Selected Nonlinear Models

Part of the richness of the theory of design for nonlinear models stems from the physical settings in which the various models arise, each resulting in unique approaches to the design problems. Some particular nonlinear regression models, of the form
$$y = \eta(x; \theta) + \varepsilon,$$
where $\varepsilon$ is random error and $\eta(x; \theta)$ is an at least partially nonlinear function of a $p$-dimensional parameter vector $\theta$, correspond to the following response functions.

The response
$$\eta(x; \theta) = \frac{\theta_1}{\theta_1 - \theta_2}\left(e^{-\theta_2 x} - e^{-\theta_1 x}\right), \quad \theta_1, \theta_2 > 0, \; x > 0,$$
for which the design problem was studied by Box and Lucas (1959), is used in chemometrics to model reactions in which a substance decomposes from a state A to a state B and finally to a state C. The parameters $\theta_1$ and $\theta_2$ measure the rates of these two decompositions, and $\eta$ is the mean yield in state B. The design variable $x$ represents time; a consequence is that, in contrast to many other design problems, there is no possibility of replication: only one observation can be made at a specific value of $x$.

Here and elsewhere, we define $f(x; \theta)$ to be the gradient
$$f(x; \theta) = \left( \frac{\partial \eta(x; \theta)}{\partial \theta_1}, \ldots, \frac{\partial \eta(x; \theta)}{\partial \theta_p} \right)^{\top}, \tag{12.1}$$
and $F(\theta)$ to be the $n \times p$ matrix with $i$th row $f^{\top}(x_i; \theta)$, where $x_i$ denotes the settings of the variables in the $i$th run of the experiment. Box and Lucas make preliminary guesses $\tilde\theta = (\tilde\theta_1, \tilde\theta_2)^{\top}$ and adopt the local D-optimality criterion, which aims to maximize the determinant $|F^{\top}(\tilde\theta) F(\tilde\theta)|$. A motivation is that when the asymptotic distribution of the parameter estimates is employed, and if the initial guesses are correct, then such a design results in confidence ellipsoids of minimum volume. When the de la Garza phenomenon holds or is assumed (this is expanded upon and exploited in Chapter 14), an optimal design will have only $p$ support points and thus $|F^{\top}(\tilde\theta) F(\tilde\theta)| = |F(\tilde\theta)|^2$; this simplifies the search for the optimal points, at least when $p$ is small and when analytic, rather than numerical, methods are being used.
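With $p = 2$ support points the local D-criterion reduces to $|F(\tilde\theta)|^2$, so a two-point design can be found by maximizing the absolute determinant of the $2 \times 2$ gradient matrix. The sketch below does this by brute-force grid search for the Box and Lucas response. The preliminary guesses $(0.7, 0.2)$, the time horizon of 20 units and the grid step are illustrative assumptions, and grid search is used here only for transparency; it is not the geometric and analytic argument of the original paper.

```python
import math


def gradient(x, theta):
    """f(x; theta) for eta = th1 / (th1 - th2) * (exp(-th2 x) - exp(-th1 x))."""
    th1, th2 = theta
    diff = th1 - th2
    a = th1 / diff
    b = math.exp(-th2 * x) - math.exp(-th1 * x)
    d1 = -th2 * b / diff ** 2 + a * x * math.exp(-th1 * x)  # d eta / d th1
    d2 = th1 * b / diff ** 2 - a * x * math.exp(-th2 * x)   # d eta / d th2
    return d1, d2


def local_d_optimal_pair(theta, x_max=20.0, step=0.05):
    """Grid search for times x1 < x2 maximizing |det F(theta)|."""
    n = int(round(x_max / step))
    grid = [i * step for i in range(1, n + 1)]
    grads = [gradient(x, theta) for x in grid]
    best_det, best_x1, best_x2 = 0.0, None, None
    for i in range(n):
        for j in range(i + 1, n):
            det = abs(grads[i][0] * grads[j][1] - grads[i][1] * grads[j][0])
            if det > best_det:
                best_det, best_x1, best_x2 = det, grid[i], grid[j]
    return best_det, best_x1, best_x2


if __name__ == "__main__":
    guess = (0.7, 0.2)  # hypothetical preliminary guesses for (theta1, theta2)
    print(local_d_optimal_pair(guess))
```

Rerunning with a different `guess` moves both time points, which shows concretely how a locally optimal design inherits any error in the preliminary parameter values.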
Box and Lucas obtained optimal points $(x_1, x_2)$ through a combination of geometric and analytic arguments, and used this example to illustrate a stepwise journey to the optimum, through fitting a sequence of quadratic models in $x_1$ and $x_2$, very similar to common practice in response surface exploration.

The Michaelis-Menten enzyme kinetic function is
$$\eta(x; \theta) = \frac{\theta_1 x}{\theta_2 + x}, \quad \theta_1, \theta_2 > 0, \; x > 0,$$
where $x$ is the concentration of substrate, $\theta_1$ the maximum reaction velocity (i.e., the horizontal asymptote as $x \to \infty$), and $\theta_2$ the half-saturation constant, that is, the value of $x$ at which the mean velocity $\eta$ attains one-half of its asymptotic value. An important feature of this model from a design standpoint is that it is nonlinear in $\theta_2$ but not in $\theta_1$, so that the loss function for D-optimality depends (up to a constant of proportionality) only on $\theta_2$. Currie (1982) discusses various designs for
this model. Assuming homoscedastic normal errors, the maximum likelihood estimates are obtained by least squares, leading to the D-optimal design which places half of the observations at $\theta_2$ and the other half at as large a value of $x$ as possible. (Bates and Watts (1988, p. 126) state instead that the two locations are $x_1 = x_{\max}$, the maximum allowable value, and $x_2 = \theta_2/(1 + 2(\theta_2/x_{\max}))$, in agreement with Currie if $x_2$ is evaluated at $x_{\max} = \infty$.) An obvious drawback to this design, shared by others in which the number of distinct locations of the explanatory variables is no larger than $p$, is that there is no possibility to check the validity of the model, a point which is also discussed in Chapter 20. Thus Currie also discusses more ad hoc, but sensible, designs in which the majority of the design points are spread out over the low range of concentration, with the rest distributed throughout the higher range. He finds that the value of $|F^{\top}(\theta) F(\theta)|$ (evaluated at the assumed value $\tilde\theta_2$) can be substantially smaller than that for the locally D-optimal design, but that the performance of this latter design can itself deteriorate markedly if the experimenter's guess at the value of $\theta_2$ is inaccurate. An obvious remedy, if conditions permit, is to design sequentially, with past observations used to give improved estimates of $\theta_2$. The Michaelis-Menten model is used throughout Chapter 14 for illustration of the concepts there.

The rational function response
$$\eta(x; \theta) = \frac{\theta_1 \theta_3 x_1}{1 + \theta_1 x_1 + \theta_2 x_2}, \quad x_1, x_2 > 0,$$
models chemical reactions of the type $R \to P_1 + P$, with $\eta$ representing the speed of the reaction, $x_1$ the partial pressure of the sought product $P$, $x_2$ the partial pressure of the product $P_1$, $\theta_2$ the absorption equilibrium constant for $P_1$, $\theta_3$ the effective constant of the speed of reaction (appearing linearly) and $\theta_1$ the absorption equilibrium constant for the reagent $R$ (Fedorov 1972).
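Returning to the Michaelis-Menten design quoted above from Bates and Watts, the two support points can be checked directly. For a two-point design $\{a, b\}$ with $b < a$, the gradient in (12.1) gives $|F| \propto ab(a - b)/\{(\theta_2 + a)^2(\theta_2 + b)^2\}$, which is increasing in $a$ and is maximized in $b$ at $b = \theta_2/(1 + 2\theta_2/a)$. The sketch below compares this closed form with a brute-force grid search; the function names, the parameter values and the grid size are assumptions made for illustration.

```python
def gradient(x, theta):
    """f(x; theta) for eta = th1 * x / (th2 + x)."""
    th1, th2 = theta
    return x / (th2 + x), -th1 * x / (th2 + x) ** 2


def second_support_point(theta2, x_max):
    """Closed form quoted in the text: theta2 / (1 + 2 * theta2 / x_max)."""
    return theta2 / (1.0 + 2.0 * theta2 / x_max)


def grid_d_optimal_pair(theta, x_max, n=200):
    """Grid search over pairs x_low < x_high in (0, x_max] maximizing |det F|."""
    step = x_max / n
    grid = [i * step for i in range(1, n + 1)]
    grads = [gradient(x, theta) for x in grid]
    best_det, best_pair = 0.0, None
    for i in range(n):
        for j in range(i + 1, n):
            det = abs(grads[i][0] * grads[j][1] - grads[i][1] * grads[j][0])
            if det > best_det:
                best_det, best_pair = det, (grid[i], grid[j])
    return best_pair


if __name__ == "__main__":
    theta = (2.0, 1.0)  # hypothetical (theta1, theta2)
    x_max = 10.0        # hypothetical largest allowable concentration
    print("grid search :", grid_d_optimal_pair(theta, x_max))
    print("closed form :", (second_support_point(theta[1], x_max), x_max))
```

Letting `x_max` grow without bound sends the lower point to $\theta_2$, which recovers the design attributed to Currie; changing `theta[0]` leaves both points unchanged, consistent with the criterion depending only on $\theta_2$.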
Box and Hunter (1965) propose a sequential approach with, at each stage, new locations $x = (x_1, x_2)$ chosen to maximize the resulting value of $|F^{\top}(\hat\theta) F(\hat\theta)|$ evaluated at the current estimates $\hat\theta$. Fedorov (1972) discusses this example in detail. Initial estimates $\tilde\theta$ of the parameter values are obtained from a preliminary experiment, with observations made at the four combinations of $x_1, x_2 \in \{1, 2\}$. Given a design specifying $n$ observations, and resulting in parameter estimates $\hat\theta_{(n)}$, the next location is given by
$$x_{n+1} = \arg\max_{x} \; f^{\top}(x; \hat\theta_{(n)}) \left[ F^{\top}(\hat\theta_{(n)}) F(\hat\theta_{(n)}) \right]^{-1} f(x; \hat\theta_{(n)}),$$
stopping once the changes in the parameter estimates become insignificant. The asymptotic optimality results of Chaudhuri and Mykland (1993) and Wu (1985), mentioned in Section 12.2, apply.

Recall that the volume of a confidence ellipsoid on the parameters is proportional to $|F^{\top} F|^{-1/2}$. Even under exact normality, the coverage probability of such regions equals the nominal value only for linear response surfaces. Hamilton et al. (1982) obtain corrected, second-order expressions for the volume of such regions, with the
correction term, which is $O_p(n^{-1})$, depending on the degree of nonlinearity of the response. Hamilton and Watts (1985) then reconsider the sequential design procedure for this rational function example as an illustration of their quadratic design criterion, which aims to minimize the corrected value of the volume. They find that each subsequent observation diminishes the effect of the nonlinearity and also that the designs can be quite different from those of Box and Hunter.

As in these examples, a preliminary goal of the experimenter might be to design for efficient estimation of the parameters; in this case, the same alphabetic optimality criteria as in linear regression are available. Or the experimenter might seek a design which aids in the selection of an appropriate model. When this is phrased as a discrimination problem, the mathematical goal could be the maximization of the power of a test of a hypothesis $\eta = \eta_0$ versus $\eta = \eta_1$, each specified up to its parameter values. If the densities $p_0(y; \eta_0(x))$ and $p_1(y; \eta_1(x))$ of $y$ under the two models are both Gaussian, this leads to the notion of T-optimality (Atkinson and Fedorov 1975a,b). More generally (López-Fidalgo et al. 2007), it leads to KL-optimality, in which the goal is to find a design $\xi$ maximizing
$$\inf_{\theta_0} \int I\left(\eta_0(x \mid \theta_0), \eta_1(x \mid \theta_1)\right) \xi(dx); \tag{12.2}$$
here
$$I(\eta_0(x), \eta_1(x)) = \int p_1(y; \eta_1(x)) \log\left\{ \frac{p_1(y; \eta_1(x))}{p_0(y; \eta_0(x))} \right\} dy$$
is the Kullback-Leibler divergence, measuring the information which is lost when $p_0$ is used to approximate $p_1$. In (12.2), $\theta_1$ is assumed known. Both static and sequential approaches are available; robustifications of this approach are discussed in Chapter 20.
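To see how the Gaussian case of (12.2) leads to a sum-of-squares criterion of the T-optimality type, the divergence can be worked out explicitly. The short derivation below assumes that $p_0$ and $p_1$ are normal densities with means $\eta_0(x)$ and $\eta_1(x)$ and a common, known variance $\sigma^2$; the common variance is an assumption added here and is not stated in the text.

```latex
% Assumption: p_j(y; \eta_j(x)) is the N(\eta_j(x), \sigma^2) density, j = 0, 1,
% with the same known \sigma^2 under both models.
\begin{align*}
I(\eta_0(x), \eta_1(x))
  &= \int p_1(y; \eta_1(x))\,
     \frac{\{y - \eta_0(x)\}^2 - \{y - \eta_1(x)\}^2}{2\sigma^2}\, dy \\
  &= \frac{\left[\sigma^2 + \{\eta_1(x) - \eta_0(x)\}^2\right] - \sigma^2}{2\sigma^2}
   = \frac{\{\eta_1(x) - \eta_0(x)\}^2}{2\sigma^2},
\end{align*}
so that the criterion (12.2) becomes
\[
  \frac{1}{2\sigma^2}\, \inf_{\theta_0} \int
  \{\eta_1(x \mid \theta_1) - \eta_0(x \mid \theta_0)\}^2 \, \xi(dx).
\]
```

Under this assumption the KL criterion is proportional to the smallest achievable integrated squared distance between the two mean functions, so maximizing it over $\xi$ places design mass where the rival model fits the assumed true model worst.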
Whatever might be the parameter-dependent loss function, a possibility is to seek a design minimizing the average loss; namely,
$$\xi_0 = \arg\min_{\xi} \int L(\xi; \theta)\, \pi(\theta)\, d\theta, \tag{12.3}$$
where $L(\xi; \theta)$ is the loss corresponding to a design $\xi$ when the true model is parameterized by $\theta$, and $\pi(\cdot)$ is a user-chosen function assigning greater weight to parameter values thought to be most plausible or perhaps to values against which one desires greater protection. For instance, the choice $L(\xi; \theta) = -\log |M(\xi; \theta)|$, where
$$M(\xi; \theta)_{p \times p} = \int_{\chi} f(x; \theta) f^{\top}(x; \theta)\, \xi(dx),$$
gives an analogue of classical D-optimality. For this choice, an equivalence theorem (Läuter 1974; see also Section 7.3 of Cox and Reid 2000) applies and states that, under mild conditions, $\xi_0$ satisfies (12.3) if and only if
$$d(x; \xi_0) = \int f^{\top}(x; \theta) M^{-1}(\xi_0; \theta) f(x; \theta)\, \pi(\theta)\, d\theta \le p,$$
at all points $x$ in the design space, with equality at the support points of $\xi_0$.

As a simple yet instructive example, suppose that one intends to fit an exponential response
$$\eta(x; \theta) = e^{-\theta x}, \tag{12.4}$$
with additive error, by least squares. Then in (12.1), $p = 1$, $f(x; \theta) = -x e^{-\theta x}$ and the requirement becomes, in an obvious notation,
$$E_{\pi}\left[ \frac{x^2 e^{-2\theta x}}{E_{\xi_0}\left[ x^2 e^{-2\theta x} \right]} \right] \le 1, \tag{12.5}$$
with equality at the support points. With a design region $\chi = (0, 1]$, (12.5) applied to a one-point design with all mass at $x_0 \in \chi$ becomes
$$E_{\pi}\left[ e^{-2\theta(x - x_0)} \right] \le (x_0/x)^2.$$
Some calculus yields $x_0 = \min\{1, 1/E_{\pi}[\theta]\}$, as given in Chaloner (1993) and restated in Dette and Neugebauer (1997), where conditions on $\pi$ are also given under which this one-point design is optimal, that is, satisfies (12.3), within the class of all designs. These conditions fail if, for instance, $\pi$ is uniform on $\Theta = [1, \theta_{\max}]$, for $\theta_{\max}$ sufficiently large. Then numerical methods must be used to obtain the minimizer in (12.3) directly, with (12.5) checked for verification of the optimality.

An overview of this approach to design, in which the weight functions $\pi(\cdot)$ are chosen according to a Bayesian paradigm, is given in Chaloner and Verdinelli (1995). For multiparameter models and priors, the integrations in (12.3) can become a significant part of the problem, requiring methods such as Markov chain Monte Carlo; see, for instance, the discussions in Atkinson and Haines (1996) and Atkinson et al. (1995).

Another possibility is to design so as to test the assumed response function for lack of fit (O'Brien 1995). Designs optimal for discrimination or for lack of fit testing are typically not very efficient for estimating the parameters of the final model; this leads to designs which optimize some mixture of these goals; see Hill et al. (1968) and the discussion in Chapter 14 of the approach of Dette et al. (2005).
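The one-point design for the exponential response (12.4) and the verification step based on (12.5) can be illustrated numerically. For a design with all mass at $x$, the average loss in (12.3) is $E_{\pi}[-\log(x^2 e^{-2\theta x})] = -2\log x + 2x E_{\pi}[\theta]$, which is minimized over $(0, 1]$ at $\min\{1, 1/E_{\pi}[\theta]\}$. The sketch below then evaluates the left side of (12.5) on a grid. The discrete priors, given as (value, weight) pairs with weights summing to one, and the grid size are assumptions made for illustration.

```python
import math


def bayes_one_point(prior):
    """x0 = min{1, 1 / E[theta]} for a discrete prior of (theta, weight) pairs."""
    mean = sum(t * w for t, w in prior)
    return min(1.0, 1.0 / mean)


def directional_check(x, x0, prior):
    """Left side of (12.5) for the design with all mass at x0:
    E_pi[ x^2 exp(-2 theta x) / (x0^2 exp(-2 theta x0)) ]."""
    return sum(w * (x / x0) ** 2 * math.exp(-2.0 * t * (x - x0)) for t, w in prior)


def max_check(prior, grid_size=1000):
    """Largest value of the left side of (12.5) over a grid on (0, 1]."""
    x0 = bayes_one_point(prior)
    return max(directional_check(i / grid_size, x0, prior)
               for i in range(1, grid_size + 1))


if __name__ == "__main__":
    narrow = [(1.0, 0.5), (2.0, 0.5)]  # hypothetical concentrated prior
    wide = [(1.0 + 99.0 * i / 199.0, 1.0 / 200.0) for i in range(200)]  # grid on [1, 100]
    for name, prior in (("narrow", narrow), ("wide", wide)):
        print(name, bayes_one_point(prior), max_check(prior))
```

A maximum not exceeding 1 is consistent with the one-point design satisfying (12.5); a maximum above 1, as one would expect for a sufficiently dispersed prior, signals that a design with more support points is needed and must be found numerically.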
Similar in nature to calibration problems in linear regression are dose-finding studies, which are also discussed in Chapter 24. Here one seeks the value of $x$ resulting in a specified mean response $\eta(x; \theta)$. If $\eta$ is explicitly invertible (in particular, if it is linear in the parameters), then estimates of $x$ may be obtained from those of $\theta$, and so the design problem is concentrated on efficient estimation of a function of the parameters. Otherwise, a possible approach is to design sequentially, guided by stochastic approximation (Cheung 2010).

A class of design problems, apparently first studied by Chernoff (1962), arises in quality control and concerns accelerated life testing. One assumes a response, typically nonlinear, relating the lifetime ($y$) of a product to stress levels ($x$) and possibly to other covariates. The experimenter usually cannot wait for a product on test to fail under normal stress levels, and so attempts to obtain inferences by subjecting the product to abnormally high stresses. The goal is accurate prediction of product lifetime at normal stress levels, so that there is a natural link here to the more general problem of designing experiments for purposes of extrapolation.
The list of design problems and applications goes on; these and others are expanded upon in Chapter 14, where the mathematical theory is also outlined. Other useful references include Bates and Watts (1988) and Seber and Wild (1989), each of which discusses modelling, inference, computations and, to some extent, design in a comprehensive manner.

12.4 Spatial Models

Spatial models pose some unique problems, both in inference and in design. Cressie (1993, p. 313) distinguishes between spatial experimental design, in which locations are fixed and the design consists of an allocation of treatments to these locations, and spatial sampling, in which the designer is faced with a spatial stochastic process (a random field), from which he or she is to choose locations at which to make observations.

Much of the impetus for spatial experimental design derived from agricultural experimentation, and hence a large debt is owed to R. A. Fisher, who introduced in the 1920s and 1930s the now common notions of replication, randomization, blocking, etc.; see Martin (1996). Randomized designs came to be replaced by more systematic layouts, the analysis of which led to particular requirements in accounting for the spatial dependence. One such requirement is neighbour balance: the requirement that, for instance, each treatment occurs the same number of times next to each other treatment. This might arise because of competition or interference between treatments. The achievement of neighbour balance in a design can lead to interesting combinatorial problems; see, for instance, Druilhet and Walter (2012).

Typically, efficiency of estimation of model parameters is not a particularly important goal in spatial studies; this is however the aim of many designs which take account of spatial information by instead adopting a particular structure of dependence between nearby observations.
Commonly, the ensuing analysis utilizes generalized least squares estimates, tailored for the particular dependence structure assumed. An optimal design then might be one which minimizes a particular loss function associated with these estimates or predictions. In all these cases, there might be dependence on covariates besides location; a possible model of the mean response at location $t$, with treatment covariates $x$, might be
$$E[y \mid x, t] = f^{\top}(x)\theta_1 + g^{\top}(t)\theta_2.$$
In this case the locations are fixed but the covariates $x$ are to be chosen by the designer. That this is a nonlinear model arises from the spatial dependence between observations, hence the dependence of the loss on the unknown parameters of the correlation structure.

In spatial sampling, as in spatial experimental design, efficiency might take a back seat to other goals dictated by the physical setting of the problem; see Thompson (1997) and Müller (2005). Geometry-based designs, often intended for exploratory purposes, might aim to be space filling. If model-free imputation of missing observations is the primary goal, then the designer might use probability sampling (Matérn 1960). When the probabilistic structure is known and prediction is the goal, then an information-theoretic approach might be apt; see Caselton and Zidek (1984), who propose the maximization of mutual information based on Shannon's entropy, and the environmental application in Zidek et al. (2000).
On the other hand, when efficient parameter estimation and parametric inference is the aim, we are in the realm of optimal sampling design. The first step is often the choice of a correlation function specifying the nature and degree of the dependencies between observations made at various locations. This function plays a central role in the prediction of the response at unsampled locations (typically through kriging, i.e. best linear unbiased prediction) and hence in the construction of designs. The choice of a particular spatial model is discussed at length in the companion handbook, Gelfand et al. (2010), and so we do not discuss it here.

A common aim of the designer is to minimize the integral (or sum, if the set of locations is discrete) of the MSEs of the predictions over all locations in the region of interest. Minimizing the maximum MSE is another possibility. This MSE might arise from the spatial variation and its estimation; another contributing factor might be the estimation of the mean response $E[y \mid x, t]$, modelled parametrically. When a regression response is modelled, the usual alphabetic optimality criteria become germane. In some applications, physical interpretations of covariance function parameters are also important and can become the objective of the design.

To give some idea of the flavour of the techniques, consider the following design problem studied by Müller (2005). A region in the Danube river basin in Austria currently has a network of 36 water quality monitoring stations. The locations are labelled relative to a grid overlying the region. To predict chloride concentration ($y$) at location $x$, the experimenter fits a regression model with spatially correlated errors and a parametric covariance function:
$$y(x) = f^{\top}(x)\beta + \varepsilon(x), \qquad \mathrm{Cov}[\varepsilon(x), \varepsilon(x')] = c(x, x'; \theta).$$
For illustrative purposes, Müller redesigns this network of 36 stations in several ways.
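For a regression model with correlated errors of the kind just displayed, the information matrix for $\beta$ under generalized least squares is $F^{\top} C^{-1} F$, where $F$ has rows $f^{\top}(x_i)$ and $C$ has entries $c(x_i, x_j; \theta)$, so a D-type criterion can be evaluated for any candidate set of sites once a covariance function is assumed. The sketch below does this for sites on a line with $f(x) = (1, x)^{\top}$. The one-dimensional setting, the exponential covariance and all function names are simplifying assumptions made for illustration; they are not the model or the algorithm of Müller (2005).

```python
import math


def exp_cov(s, t, sill=1.0, corr_range=1.0):
    """Assumed covariance sill * exp(-|s - t| / corr_range) (illustrative only)."""
    return sill * math.exp(-abs(s - t) / corr_range)


def solve(a, b):
    """Solve a z = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = aug[r][n] - sum(aug[r][c] * z[c] for c in range(r + 1, n))
        z[r] = s / aug[r][r]
    return z


def gls_d_criterion(sites, cov=exp_cov):
    """det(F' C^{-1} F) for f(x) = (1, x) at distinct one-dimensional sites."""
    n = len(sites)
    c = [[cov(s, t) for t in sites] for s in sites]
    a = solve(c, [1.0] * n)      # C^{-1} 1
    b = solve(c, list(sites))    # C^{-1} x
    m00 = sum(a)                                   # 1' C^{-1} 1
    m01 = sum(ai * x for ai, x in zip(a, sites))   # x' C^{-1} 1
    m11 = sum(bi * x for bi, x in zip(b, sites))   # x' C^{-1} x
    return m00 * m11 - m01 * m01


if __name__ == "__main__":
    print("spread   :", gls_d_criterion([0.0, 0.5, 1.0]))
    print("clustered:", gls_d_criterion([0.4, 0.5, 0.6]))
```

Sites must be distinct, since repeated sites make $C$ singular under this covariance; this mirrors the absence of replication in monitoring-network problems. The criterion also depends on the covariance parameters, here `sill` and `corr_range`, which in practice are unknown.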
In all cases, an important feature is that there is no notion of replication: only one monitoring station may be placed at a particular location. The first design illustrated is D-optimal, maximizing the determinant of the information matrix for $\beta$ (with $f(x) = (1, x^{\top})^{\top}$); this matrix of course depends on the covariance function, taken to be
$$c(x, x'; \theta) = \begin{cases} \theta_1 + \theta_2, & x = x', \\ \theta_2 \left\{ 1 - \dfrac{3}{2}\left( \dfrac{\|x - x'\|}{\theta_3} \right) + \dfrac{1}{2}\left( \dfrac{\|x - x'\|}{\theta_3} \right)^{3} \right\}, & 0 < \|x - x'\| \le \theta_3, \\ 0, & \|x - x'\| > \theta_3. \end{cases}$$
Exchange algorithms are introduced to carry out the optimization.

FIGURE 12.1 D-optimal network of chlorine monitoring stations. (From Müller, W.G., Environmetrics, 16, 495, 2005.)

The resulting design is in Figure 12.1; a notable feature is that the design calls for all stations to be concentrated at
the boundary of the region, but to be somewhat evenly distributed on this boundary. Presumably the managers of such a network would be asked if they were perhaps duplicating efforts of others immediately across the geographic boundary of their region.

Another method of D-optimal design construction in Müller (2005) relies on an expansion of the covariance function in terms of eigenfunctions $\{\varphi_l(x)\}$, resulting in an approximation of the process as
$$y(x) = f^{\top}(x)\beta + \sum_{l=1}^{p} \gamma_l \varphi_l(x) + e(x),$$
with uncorrelated errors $\{e(x)\}$. Here the $\{\gamma_l\}$ are uncorrelated random variables with variances given by the eigenvalues corresponding to the $\varphi_l$. This representation allows for an analysis by random coefficient regression, leading to the design in Figure 12.2, exhibiting a greater coverage of the region than that of Figure 12.1.

FIGURE 12.2 Network of chlorine monitoring stations obtained via an expansion of the covariance kernel, followed by D-optimality. (From Müller, W.G., Environmetrics, 16, 495, 2005.)

There is a close relationship between spatial sampling and the design of computer experiments. Although there is no random error, in the usual sense, in such experiments, it is common to model the dependencies between the outputs of experiments with distinct inputs via spatial correlation structures. This then engenders a certain similarity in the design problems: the inputs to the computer experiment, to be chosen by the designer, play much the same role as do the locations in spatial sampling. Designs for computer experiments are discussed in Section V.

The computational demands involved in constructing spatial designs can be immense. Some techniques which have been attempted, with varying measures of success, are exchange algorithms, simulated annealing and genetic algorithms. These, and many of the topics touched on previously, are discussed at length in Chapter 15.

References

Atkinson, A. C., Demetrio, C. G.
B., and Zocchi, S. S. (1995), Optimum dose levels when males and females differ in response, Applied Statistics, 44.
Atkinson, A. C. and Fedorov, V. V. (1975a), The design of experiments for discriminating between two rival models, Biometrika, 62.
Atkinson, A. C. and Fedorov, V. V. (1975b), Optimal design: Experiments for discriminating between several models, Biometrika, 62.
Atkinson, A. C. and Haines, L. M. (1996), Designs for nonlinear and generalized linear models, in: Design and Analysis of Experiments, Handbook of Statistics, Vol. 13; eds. Ghosh, S. and Rao, C. R., Elsevier/North-Holland.
Bates, D. M. and Watts, D. G. (1988), Nonlinear Regression Analysis and Its Applications, Wiley, New York.
Box, G. E. P. and Hunter, W. G. (1959), Design of experiments in non-linear situations, Biometrika, 46.
Box, G. E. P. and Lucas, H. L. (1965), The experimental study of physical mechanisms, Technometrics, 7.
Burridge, J. and Sebastiani, P. (1992), Optimal designs for generalized linear models, Journal of the Italian Statistical Society, 1.
Caselton, W. F. and Zidek, J. V. (1984), Optimal monitoring network designs, Statistics and Probability Letters, 2.
Chaudhuri, P. and Mykland, P. A. (1993), Nonlinear experiments: Optimal design and inference based on likelihood, Journal of the American Statistical Association, 88.
Chaloner, K. (1993), A note on optimal Bayesian design in nonlinear problems, Journal of Statistical Planning and Inference, 37.
Chaloner, K. and Verdinelli, I. (1995), Bayesian experimental design: A review, Statistical Science, 10.
Chernoff, H. (1962), Optimal accelerated life designs for estimation, Technometrics, 4.
Cheung, Y. K. (2010), Stochastic approximation and modern model-based designs for dose-finding clinical trials, Statistical Science, 25.
Cox, D. R. and Reid, N. (2000), The Theory of the Design of Experiments, Chapman & Hall.
Currie, D. J. (1982), Estimating Michaelis-Menten parameters: Bias, variance and experimental design, Biometrics, 38.
Cressie, N. (1993), Statistics for Spatial Data, Wiley, New York.
Dette, H., Melas, V. B., and Wong, W.-K. (2005), Optimal design for goodness-of-fit of the Michaelis-Menten enzyme kinetic function, Journal of the American Statistical Association, 100.
Dette, H. and Neugebauer, H.-M. (1997), Bayesian D-optimal designs for exponential regression models, Journal of Statistical Planning and Inference, 60.
Dror, H. A. and Steinberg, D. M. (2008), Sequential experimental designs for generalized linear models, Journal of the American Statistical Association, 103.
Druilhet, P. and Walter, T. (2012), Efficient circular neighbour designs for spatial interference model, Journal of Statistical Planning and Inference, 142.
Elfving, G. (1952), Optimal allocation in linear regression theory, Annals of Mathematical Statistics, 23.
Fedorov, V. V. (1972), Theory of Optimal Experiments, Academic Press, New York.
Fisher, R. A. (1922), On the mathematical foundations of theoretical statistics, Philosophical Transactions of the Royal Society of London, Series A, 222.
Ford, I., Titterington, D. M., and Kitsos, C. P. (1989), Recent advances in nonlinear experimental design, Technometrics, 31.
Gelfand, A. E., Diggle, P., Fuentes, M., and Guttorp, P. (2010), Handbook of Spatial Statistics, Chapman & Hall, New York.
Hamilton, D. C. and Watts, D. G. (1985), A quadratic design criterion for precise estimation in nonlinear regression models, Technometrics, 27.
Hamilton, D. C., Watts, D. G., and Bates, D. M. (1982), Accounting for intrinsic nonlinearity in nonlinear regression parameter inference regions, Annals of Statistics, 10.
Hill, W. J., Hunter, W. G., and Wichern, D. W. (1968), A joint design criterion for the dual problem of model discrimination and parameter estimation, Technometrics, 10.
Khuri, A. I., Mukherjee, B., Sinha, B. K., and Ghosh, M. (2006), Design issues for generalized linear models: A review, Statistical Science, 21.
Läuter, E. (1974), Experimental design in a class of models, Mathematische Operationsforschung Statistik, 5.
López-Fidalgo, J., Tommasi, C., and Trandafir, P. C. (2007), An optimal experimental design criterion for discriminating between non-normal models, Journal of the Royal Statistical Society B, 69.
Martin, R. J. (1996), Spatial experimental design, in: Design and Analysis of Experiments, Handbook of Statistics, Vol. 13; eds. Ghosh, S. and Rao, C. R., Elsevier/North-Holland.
Matérn, B. (1960), Spatial Variation, Springer-Verlag, Berlin.
McCullagh, P. and Nelder, J. A. (1989), Generalized Linear Models, Wiley, New York.
Müller, W. G. (2005), A comparison of spatial design methods for correlated observations, Environmetrics, 16.
Myers, R. H. (1999), Response surface methodology: Current status and future directions, Journal of Quality Technology, 31.
O'Brien, T. E. (1995), Optimal design and lack of fit in nonlinear regression models, in: Proceedings of the 10th International Workshop on Statistical Modelling, Lecture Notes in Statistics, Springer-Verlag, New York.
Robinson, K. S. and Khuri, A. I. (2003), Quantile dispersion graphs for evaluating and comparing designs for logistic regression models, Computational Statistics and Data Analysis, 43.
Seber, G. A. F. and Wild, C. J. (1989), Nonlinear Regression, Wiley, New York.
Sinha, S. and Wiens, D. P. (2002), Robust sequential designs for nonlinear regression, The Canadian Journal of Statistics, 30.
Thompson, S. K. (1997), Effective sampling strategies for spatial studies, Metron, 55.
Wu, C. F. J. (1985), Asymptotic inference from sequential design in a nonlinear situation, Biometrika, 72.
Yang, M. (2008), A-optimal designs for generalized linear models with two parameters, Journal of Statistical Planning and Inference, 138.
Zidek, J. V., Sun, W., and Le, N. D. (2000), Designing and integrating composite networks for monitoring multivariate Gaussian pollution fields, Applied Statistics, 49.
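
Illustrative sketch. The eigenfunction expansion of the covariance kernel and the exchange algorithms mentioned in the discussion of spatial models can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the construction used by Müller (2005): the exponential kernel, the 7 x 7 candidate grid, the truncation level p = 5, the linear trend f(x) = (1, x1, x2), and the function names `kl_basis`, `log_det_information` and `exchange_design` are all hypothetical choices made for illustration. The eigenfunctions are approximated by eigenvectors of the covariance matrix on the grid, the random coefficients are given prior precisions equal to the reciprocal eigenvalues, and a simple point-exchange search maximizes the log-determinant of the resulting information matrix.

```python
import numpy as np

def kl_basis(points, length_scale, p):
    """Leading p eigenpairs of an (assumed) exponential covariance matrix on the grid."""
    dists = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    cov = np.exp(-dists / length_scale)
    eigvals, eigvecs = np.linalg.eigh(cov)          # eigenvalues in ascending order
    return eigvals[::-1][:p], eigvecs[:, ::-1][:, :p]

def log_det_information(G, prior_precision, rows, noise_var=1.0):
    """log det of G_s' G_s / noise_var + diag(prior_precision) for the chosen rows."""
    Gs = G[rows]
    M = Gs.T @ Gs / noise_var + np.diag(prior_precision)
    sign, logdet = np.linalg.slogdet(M)
    return logdet if sign > 0 else -np.inf

def exchange_design(G, prior_precision, n, rng, max_sweeps=20):
    """Point-exchange search: swap one design point at a time while the criterion improves."""
    N = G.shape[0]
    design = [int(i) for i in rng.choice(N, size=n, replace=False)]
    best = log_det_information(G, prior_precision, design)
    for _ in range(max_sweeps):
        improved = False
        for i in range(n):
            for cand in range(N):
                if cand in design:
                    continue
                trial = design.copy()
                trial[i] = cand
                val = log_det_information(G, prior_precision, trial)
                if val > best + 1e-10:
                    design, best, improved = trial, val, True
        if not improved:
            break
    return sorted(design), best

# Candidate sites: a 7 x 7 grid on the unit square (an assumption for illustration).
side = np.linspace(0.0, 1.0, 7)
points = np.array([(a, b) for a in side for b in side])

eigvals, phi = kl_basis(points, length_scale=0.3, p=5)
F = np.column_stack([np.ones(len(points)), points])   # trend regressors f(x) = (1, x1, x2)
G = np.hstack([F, phi])                               # regressors of the random coefficient model
# Flat prior on beta (precision 0); gamma_l has variance equal to the l-th eigenvalue.
prior_precision = np.concatenate([np.zeros(F.shape[1]), 1.0 / eigvals])

design, best = exchange_design(G, prior_precision, n=8, rng=np.random.default_rng(0))
print("selected sites:\n", points[design])
print("log-determinant criterion:", best)
```

The exchange search only guarantees a local optimum, so in practice one would typically restart it from several random initial designs; simulated annealing and genetic algorithms are alternatives aimed at the same difficulty.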
More informationGeneralized Linear Models
Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n
More information