Bayesian D-optimal Design


Susanne Zaglauer, Michael Deflorian

Abstract

D-optimal and other model-based experimental designs are often criticized for their dependence on the statistical model and for the lack of any explicit provision for irregularities detected in the model. Furthermore, D-optimal experimental designs tend to weight the boundary of the experimental space heavily, meaning that in extreme cases the boundary points of the experimental space make up all the experimental candidates. To address this criticism, a Bayesian modification of the D-optimal design is introduced in this paper, which preserves flexibility and is at the same time more robust against the bias caused by the model. In addition, it allows the experimenter to play it safe with respect to the model assumption: the Bayesian design adds center points or other control points to the D-optimal design in order to decide whether the model has uncertainties. An advantage of this approach is that only a small modification of the D-optimal design is required to make it less dependent on the chosen model. The presented experimental design has been applied to the identification of the dynamic behavior of combustion engines and has been evaluated by simulations.

Kurzfassung (translated from German)

D-optimal and other model-based experimental designs are often criticized for their dependence on the statistical model and the missing explicit provision for irregularities detected in the model. Furthermore, D-optimal designs tend to weight the boundary regions of the experimental space disproportionately; in the extreme case, the boundary points of the investigation space form the experimental candidates. To meet this criticism, this contribution presents a Bayesian modification of the D-optimal approach that preserves flexibility while being more robust against the bias caused by the model. In addition, it allows the experimenter to play it safe with respect to the model assumption. To this end, the Bayesian D-optimal design adds center points or other control points to the D-optimal design in order to decide whether the model exhibits uncertainties. An advantage of this approach is that D-optimal approaches require only minor modifications to make the D-optimal design less dependent on the chosen model assumption. The presented experimental design is used for the identification of the dynamic behavior of combustion engines and is evaluated by simulation.

Introduction

Statistical Design of Experiments (DoE) is clearly gaining importance in engine development. The application area of statistical DoE for modeling and systematic optimization in engine development goes far beyond classic quality tasks, and it is applied in almost every step of development. The goal is to achieve statistically verified results at low cost. To achieve this goal, designs of experiments are used. D-optimal designs are very popular for this purpose, but they are often criticized for their dependence on the statistical model and for the lack of any explicit provision for irregularities detected in the model. The aim of this contribution is to introduce a modification of the D-optimal design which compensates for these disadvantages. The second section contains a mathematical derivation of the Fisher information matrix, which plays an important role in the Bayesian D-optimal design presented afterwards. In the third section the previously introduced modification of the D-optimal design is evaluated by simulations and the results are presented. A summary of the advantages of the Bayesian D-optimal design and an outlook on future projects conclude the contribution.

Design of Experiments

An important question is how the measuring points should be distributed efficiently in the experimental space. The answer to this question is provided by DoE. The goal is to capture the relationships between target quantities and influence factors systematically with as few experiments as possible, i.e. to obtain the maximum information about the system under investigation with every measurement. The kind of DoE determines the distribution of the points in the experimental space. In this contribution a Bayesian D-optimal design is examined further.

Fisher Information

The Fisher information (FI) plays an important role in the Bayesian D-optimal design and is therefore explained in this section. It describes the information content of a random variable x with respect to
the parameter θ on which the likelihood function L(x; θ) = p(x; θ) depends:

$$\mathrm{FI}(\theta) = E\left[\left(\frac{\partial \ln p(x;\theta)}{\partial \theta}\right)^{2}\right] \qquad (1)$$

For an unbiased estimator $\hat\theta(x)$ it holds that

$$E\left[\hat\theta(x) - \theta\right] = \int \left(\hat\theta(x) - \theta\right) p(x;\theta)\, dx = 0. \qquad (2)$$
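Definition (1) can be checked numerically. A minimal sketch, assuming a scalar Gaussian observation x ~ N(θ, σ²) (an example distribution, not one used in the paper): the Monte Carlo average of the squared score converges to the analytic value 1/σ².

```python
import numpy as np

# Monte Carlo check of the Fisher information definition (1),
#   FI(theta) = E[(d ln p(x; theta) / d theta)^2],
# for x ~ N(theta, sigma^2), where the score is (x - theta)/sigma^2 and the
# analytic value is FI = 1/sigma^2. All numeric values are illustrative
# choices, not taken from the paper.

rng = np.random.default_rng(0)
theta, sigma = 2.0, 0.5
x = rng.normal(theta, sigma, size=200_000)

score = (x - theta) / sigma**2     # d/dtheta of ln p(x; theta)
fi_mc = np.mean(score**2)          # Monte Carlo estimate of E[score^2]
fi_exact = 1.0 / sigma**2          # analytic Fisher information

print(fi_mc, fi_exact)             # the two values agree closely
```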

The likelihood function p(x; θ) describes the probability of observing the random variable x for a given θ. If p(x; θ) has a sharp peak it is easy to estimate the correct parameter θ, i.e. the data contain a lot of information about θ. If the likelihood function is flat and spread out, many data are needed to estimate the parameter (the data contain little information about θ). By differentiating equation (2) with respect to θ we obtain

$$\int \left(\hat\theta(x) - \theta\right) \frac{\partial p(x;\theta)}{\partial \theta}\, dx - \int p(x;\theta)\, dx = 0.$$

As the likelihood function is a probability distribution,

$$\int p(x;\theta)\, dx = 1, \qquad (3)$$

so that

$$\int \left(\hat\theta(x) - \theta\right) \frac{\partial p(x;\theta)}{\partial \theta}\, dx = 1, \qquad (4)$$

and through the relation $(\ln u)' = u'/u$,

$$\frac{\partial p(x;\theta)}{\partial \theta} = p(x;\theta)\, \frac{\partial \ln p(x;\theta)}{\partial \theta}. \qquad (5)$$

This results in

$$\int \left(\hat\theta(x) - \theta\right) p(x;\theta)\, \frac{\partial \ln p(x;\theta)}{\partial \theta}\, dx = 1. \qquad (6)$$

Factorizing the integrand, one obtains

$$\int \left[\left(\hat\theta(x) - \theta\right)\sqrt{p(x;\theta)}\right] \left[\sqrt{p(x;\theta)}\, \frac{\partial \ln p(x;\theta)}{\partial \theta}\right] dx = 1. \qquad (7)$$

Squaring this equation and using the Cauchy–Schwarz inequality implies

$$\left[\int \left(\hat\theta(x) - \theta\right)^{2} p(x;\theta)\, dx\right] \left[\int \left(\frac{\partial \ln p(x;\theta)}{\partial \theta}\right)^{2} p(x;\theta)\, dx\right] \ge 1. \qquad (8)$$

The second factor of (8) is the Fisher information,

$$\mathrm{FI}(\theta) = \int \left(\frac{\partial \ln p(x;\theta)}{\partial \theta}\right)^{2} p(x;\theta)\, dx. \qquad (9)$$

The first factor is the expected squared error of the estimator, because

$$\int \left(\hat\theta(x) - \theta\right)^{2} p(x;\theta)\, dx = E\left[\left(\hat\theta(x) - \theta\right)^{2}\right]. \qquad (10)$$

So the uncertainty in the parameter is characterized by the inverse of the Fisher information. The same result is expressed by the Cramér–Rao inequality: the inverse of the Fisher information is a lower bound on the variance of any unbiased estimator of θ, and the precision with which θ can be estimated is limited by

$$\mathrm{Var}\!\left(\hat\theta(x)\right) \ge \frac{1}{\mathrm{FI}(\theta)}. \qquad (11)$$
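The Cramér–Rao inequality (11) can be illustrated by simulation. A minimal sketch, assuming n i.i.d. Gaussian observations and the sample mean as unbiased estimator (illustrative values only; for this model FI = n/σ², so the bound is σ²/n and the sample mean attains it):

```python
import numpy as np

# Monte Carlo illustration of the Cramér-Rao bound (11): for
# x_1, ..., x_n ~ N(theta, sigma^2) the sample mean is an unbiased
# estimator whose variance equals 1/FI(theta) = sigma^2/n.
# All numeric values are illustrative assumptions.

rng = np.random.default_rng(1)
theta, sigma, n = 1.0, 2.0, 25
trials = 100_000

samples = rng.normal(theta, sigma, size=(trials, n))
theta_hat = samples.mean(axis=1)   # unbiased estimator of theta per trial

var_mc = theta_hat.var()           # observed estimator variance
crb = sigma**2 / n                 # Cramér-Rao lower bound 1/FI

print(var_mc, crb)                 # var_mc is close to crb: the mean is efficient
```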

With N parameters θ = (θ₁, …, θ_N)ᵀ the Fisher information becomes an N × N matrix,

$$\mathrm{FIM}(\theta)_{ij} = E\left[\frac{\partial \ln p(x;\theta)}{\partial \theta_i}\, \frac{\partial \ln p(x;\theta)}{\partial \theta_j}\right], \qquad (12)$$

and in equation (11) the variance is replaced by the covariance matrix. If the random vector x follows a multivariate Gaussian distribution, x ~ N(μ(θ), Σ(θ)), with the mean value μ(θ) a known function of the unknown parameter vector θ and with covariance matrix Σ(θ), the probability distribution of x is

$$p(x;\theta) = (2\pi)^{-0.5\,n}\, \left(\det \Sigma(\theta)\right)^{-0.5} \exp\!\left(-\tfrac{1}{2}\left(x - \mu(\theta)\right)^{T} \Sigma(\theta)^{-1} \left(x - \mu(\theta)\right)\right) \qquad (13)$$

and the Fisher information matrix becomes

$$\mathrm{FIM}(\theta)_{ij} = \frac{\partial \mu^{T}}{\partial \theta_i}\, \Sigma^{-1}\, \frac{\partial \mu}{\partial \theta_j} + \frac{1}{2}\, \mathrm{tr}\!\left(\Sigma^{-1} \frac{\partial \Sigma}{\partial \theta_i}\, \Sigma^{-1} \frac{\partial \Sigma}{\partial \theta_j}\right). \qquad (14)$$

With uncorrelated Gaussian noise, Σ(θ) = σ² I, it follows from equation (14) and [1] that

$$\mathrm{FIM}(\theta) = \frac{1}{\sigma^{2}}\, \frac{\partial \mu^{T}}{\partial \theta}\, \frac{\partial \mu}{\partial \theta}. \qquad (15)$$

For the construction of the FIM as a measure of the parameter uncertainty this yields

$$\mathrm{FIM} = \frac{1}{\sigma^{2}} \sum_{k=1}^{n} \mathrm{FIM}_k \qquad (16)$$

with

$$\mathrm{FIM}_k = \frac{\partial f(u_k;\hat p)}{\partial p}\, \frac{\partial f(u_k;\hat p)}{\partial p}^{T}, \qquad (17)$$

where p̂ is the estimated parameter vector of the model output f(u_k; p) and n is the number of input–output measurements. The Fisher information matrix for the multivariate Gaussian distribution (15) is therefore

$$\mathrm{FIM}(\hat p, \sigma^{2}) = \frac{1}{\sigma^{2}} \sum_{k=1}^{n} \frac{\partial f(u_k;\hat p)}{\partial p}\, \frac{\partial f(u_k;\hat p)}{\partial p}^{T}. \qquad (18)$$

Bayesian D-optimal Design

In the following, a Bayesian modification of the D-optimal design is presented. The D-optimal design maximizes det(XᵀX) (see [2]): the design matrix X should be chosen such that the uncertainty about θ is small, and increasing the determinant of XᵀX reduces the error variances of the coefficients, which are proportional to (XᵀX)⁻¹. In this definition the design matrix X has one column for every term in the assumed model. This model is an approximation to the behavior of the real system, so it seems quite natural to ask about the quality of the approximation. But in maximizing det(XᵀX),
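The sum construction in (16)–(18) can be sketched numerically. A minimal sketch, assuming a hypothetical scalar model output f(u; p) (a first-order step response, chosen only for illustration) and a finite-difference Jacobian; the FIM is then (1/σ²) JᵀJ, and a D-optimal design would pick the inputs u_k that maximize its determinant:

```python
import numpy as np

# Sketch of the FIM construction (18) for a scalar-output model f(u; p)
# with uncorrelated Gaussian noise of variance sigma^2:
#   FIM = (1/sigma^2) * sum_k (df/dp)(u_k) (df/dp)(u_k)^T = (1/sigma^2) J^T J.
# The model f and all numeric values are illustrative assumptions.

def f(u, p):
    # hypothetical first-order step response, p = (gain, rate)
    return p[0] * (1.0 - np.exp(-p[1] * u))

def jacobian(u, p, eps=1e-6):
    # central finite-difference gradient df/dp at each input point u_k
    J = np.empty((u.size, p.size))
    for j in range(p.size):
        dp = np.zeros_like(p)
        dp[j] = eps
        J[:, j] = (f(u, p + dp) - f(u, p - dp)) / (2.0 * eps)
    return J

u = np.linspace(0.1, 5.0, 20)      # candidate input points
p = np.array([2.0, 0.8])           # nominal parameter vector
sigma = 0.1

J = jacobian(u, p)
FIM = J.T @ J / sigma**2           # equation (18)
cov_bound = np.linalg.inv(FIM)     # Cramér-Rao bound on the parameter covariance

print(np.linalg.det(FIM))          # a D-optimal design maximizes this determinant
```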

the whole experimental effort is spent on the precise estimation of the assumed model parameters. The D-optimal design therefore makes no provision for detecting irregularities in the model. To avoid this, the Bayesian design adds center points or other control points, both to factorial and to optimal designs, in order to decide whether the model has uncertainties. Control points in a design serve mainly as protection against higher-order effects which are not included in the assumed model. These terms are called potential terms; the assumed model itself contains only primary terms. Typically the sample size is not large enough to estimate all primary and potential terms simultaneously. The issue is therefore to develop an approach which estimates the primary terms precisely while still providing some ability to detect and estimate the potential terms. Suppose there are q potential terms that are just possibly important in addition to the p primary terms that one really wants to fit. The model is

$$E(y) = X\theta \quad \text{with} \quad X = \left(X_{\mathrm{pri}}, X_{\mathrm{pot}}\right),$$

whereby X has p + q columns. The sample size n is often not large enough to estimate all p + q coefficients precisely; typically p ≤ n < p + q holds. As described in [3], scaling and centering are introduced so that standardized prior distributions of the coefficients work well in diverse contexts. Without loss of generality it is assumed that the residual standard deviation of y equals 1, that every non-constant term varies between −1 and +1, and that every potential term is approximately uncorrelated with all primary terms. Therefore max(x_pri) = +1 and min(x_pri) = −1 hold for every non-constant primary term x_pri, and max(x_pot) − min(x_pot) = 1 holds for every potential term x_pot. For every pair of primary and potential terms,

$$\sum_{\text{candidates}} x_{\mathrm{pot}}\, x_{\mathrm{pri}} = 0, \qquad (19)$$

whereby max, min and the sum are calculated over the whole set of candidate points of the design. In practice this scaling can be achieved by regressing the potential terms against the primary terms over the candidate points in order to evaluate α and Z, whereby

$$\alpha = \left(X_{\mathrm{pri}}^{T} X_{\mathrm{pri}}\right)^{-1} X_{\mathrm{pri}}^{T} X_{\mathrm{pot}}, \qquad (20)$$

$$R = X_{\mathrm{pot}} - X_{\mathrm{pri}}\, \alpha \qquad (21)$$

and

$$Z = \frac{R}{\max(R) - \min(R)}, \qquad (22)$$

where max, min and the definition of Z again use the set of candidate points. The definition of Z then leads to X = (X_pri, Z) instead of X = (X_pri, X_pot). Here α is the least squares regression coefficient of the regression of X_pot on X_pri, and R is the residual of that regression.
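The scaling steps (20)–(22) can be sketched as follows. A minimal sketch; the candidate set, the primary terms (an intercept and a linear term) and the single potential term (a quadratic) are assumptions chosen for illustration:

```python
import numpy as np

# Sketch of the scaling (20)-(22): regress the potential terms on the primary
# terms over the candidate set, take the residuals R, and rescale them to
# range 1, so that the scaled potential terms Z are (approximately)
# uncorrelated with the primary terms. Candidate set and terms are assumed.

rng = np.random.default_rng(2)
x = rng.uniform(-1.0, 1.0, size=50)              # candidate points in [-1, 1]

X_pri = np.column_stack([np.ones_like(x), x])    # primary terms: 1, x
X_pot = (x**2)[:, None]                          # potential term: x^2

# (20) least squares coefficients of the potential terms on the primary terms
alpha, *_ = np.linalg.lstsq(X_pri, X_pot, rcond=None)
# (21) residuals of that regression
R = X_pot - X_pri @ alpha
# (22) rescale each potential column to range 1
Z = R / (R.max(axis=0) - R.min(axis=0))

# Z replaces X_pot in the design matrix: X = (X_pri, Z)
print(X_pri.T @ Z)                               # entries are essentially zero
```

By construction the least squares residuals are orthogonal to the primary columns, so the candidate-set sums in (19) vanish up to numerical precision.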
evaluate and Z whereby ( pri ) pri pri pot (0) and r pot pri () Z r max( r) min( r) () whereas ma min and the definition of use the set of the potential points herefore the definition of leads to ( Z) instead of ) In addition pri is the lowest quadratic regression coefficient of the regression pot ( pri pot on and r is the residual of pri