Introduction to Design and Analysis of Computer Experiments
Thomas Santner, Department of Statistics, The Ohio State University
October 2010
Outline: Empirical Experimentation
Empirical Experimentation

Physical experiments (agriculture, industry, medicine) are the gold standard for determining the relationship between an endpoint of scientific interest and the potential factors that affect it.

Viewpoint: experimental output = true I/O relationship + noise,

    Y_p(x) = eta_true(x) + epsilon(x),

where eta_true(x) is the unknown true I/O relationship at input x.
Empirical Experimentation: Some Challenges for Physical Experiments

- There are scientifically unimportant nuisance (blocking) factors; model these factors via the x_i.
- Some factors affecting the outcome are not recognized; randomize treatments to experimental units to prevent unrecognized factors from systematically affecting treatment comparisons.
- Choice of sample size: determine right-sized experiments to detect important treatment differences.
- Stochastic models relate the inputs x to the output Y_p(x).
Empirical Experimentation: Stochastic Simulation Experiments

A popular tool for studying complex physical systems, each of whose parts behaves in a known stochastic manner but whose ensemble behavior is not understood analytically. (Heavily used in industrial engineering, e.g., to compare job-shop setups.)
Computer and Physical Experiments

A computer experiment is the use of a (deterministic) computer simulator that implements a detailed mathematical model to study an input/output relationship of scientific interest (a micro model, versus the macro model used in stochastic simulation experiments).

- The mathematical model is typically a PDE/ODE, implemented via finite elements, CFD, ...
- Some applications have data from both (one or more) computer experiment(s) and (one or more) physical experiment(s) to investigate the true input/output relationship of scientific interest. Thus computer experiments are used as adjuncts to, or surrogates for, physical experiments for studying input/output relationships.
Designing an Acetabular Cup

Ong et al. (2006) conducted an uncertainty analysis of the effects of engineering cup design, surgical, and patient variables on the stability of uncemented acetabular cups (porous ingrowth cups, which rely on growth of the bone into the prosthesis).

- Engineering (cup design) inputs: stem length, stem cross section, sphericity of the acetabular cup, etc.
- Patient inputs: elastic modulus of the bone, friction between the prosthesis stem and the bone.
- Surgical inputs: accuracy of the reamed acetabular cup in the pelvis (gouges in the area between the rear of the acetabular cup insert and the bone provide space for later debris buildup).
- Output: one of three measures of the amount of ingrowth of the bone into the acetabular cup.
Other Disciplinary Examples

- Biomechanics: Rawlinson et al. (2006) simulate the stresses and strains experienced by knee prostheses of different engineering designs.
- Aeronautical Engineering
- Drug Development
- Economics: Lempert et al. (2000) introduce the Wonderland model, which simulates world economic development under given scenarios of innovation and pollution.
- Tissue Engineering: Spilker (2009) modeled the kinetics of fluid flow in a two-phase model of cartilage under sinusoidal loading.
Features of Computer Simulators: x -> Code -> y(x)

- y(x) is a proxy for the physical process (based on simplified physics or biology).
- y(x) is presumed to be a biased representation of the true input-response relationship (the true relationship would be obtainable by a physical experiment).
- y(x) need not be replicated (the code is deterministic), in contrast to stochastic computer simulations.
- x can be high-dimensional (and is often functional).
- When x is high-dimensional, screening is important.
- Computer codes used in practice have running times that can range from minutes to days.
Prediction Based on Computer Output

Suppose we have only computer output and no data from an associated physical experiment (momentarily forget about the possibility of data from an associated physical experiment).

Warning: we can't determine the code's bias in such a setting.

Notation: let the inputs and outputs of the training data be

    (x_1^tr, y_c(x_1^tr)), ..., (x_n^tr, y_c(x_n^tr)).

Goal: predict y_c(x_0), where x_0 is an untried ("new") input.
Prediction Based on Computer Output

Idea: regard y_c(x) as a realization (a "draw") from a random function Y_c(x). This is a Bayesian model; the desired smoothness properties of y_c(x) must be built into all functions drawn from Y_c(x).
Prediction Based on Computer Output

This provides flexibility. [Figure: draws from three "urns" (Y_c(x) processes) with different correlation functions R(h)]
Prediction Based on Computer Output

- We use the training data to pick the exact Y_c(x) "urn" we draw from.
- A naive predictor: y_hat_c(x_0) = E{Y_c(x_0) | training data}.
- A measure of prediction uncertainty (due to uncertainty about the model) is Var(Y_c(x_0) | training data).
The simplest possible model for Y(x) is the regression-like model

    Y(x) = sum_{j=1}^k beta_j f_j(x) + Z(x) = beta' f(x) + Z(x)

(Y(x) = large-scale trend + smooth deviations), where

- f_1(x), ..., f_k(x) are known regression functions,
- beta_1, ..., beta_k are unknown regression coefficients, and
- usually beta' f(x) = beta_0 in computer experiments.
Y(x) = sum_{j=1}^k beta_j f_j(x) + Z(x) = beta' f(x) + Z(x)

In regression we regard deviations from the regression mean as measurement errors, meaning that Z(x_1) and Z(x_2) are unrelated ("white noise"). In computer experiments we want a Z(x) model that exhibits smooth deviations from the large-scale trend. The simplest such model for Z(x) is a stationary Gaussian stochastic process ("GaSP").
Stationary GaSPs

Z(x) is stationary if, for any x, the random variable Z(x) has constant mean and constant variance, and Cor(Z(x_1), Z(x_2)) depends only on x_1 - x_2, i.e.,

- E{Z(x)} = 0 (so E{Y(x)} = beta' f(x) (= beta_0)),
- Var(Z(x)) = sigma_Z^2 (so Var(Y(x)) = sigma_Z^2),
- Cor(Z(x_1), Z(x_2)) is determined by a correlation function, i.e., an R(.) satisfying R(0) = 1 with Cor(Z(x_1), Z(x_2)) = R(x_1 - x_2) (so Cor(Y(x_1), Y(x_2)) = R(x_1 - x_2) and Cov(Y(x_1), Y(x_2)) = sigma_Z^2 R(x_1 - x_2)).
Stationary GaSPs

For any s >= 1 and x_1, ..., x_s,

    Z = (Z_1, ..., Z_s) = (Z(x_1), ..., Z(x_s)) is multivariate normal.

The process Y(.) is completely determined once we identify (beta_0, sigma_Z^2, R(.)), i.e.,

    Y = (Y_1, ..., Y_s) = (Y(x_1), ..., Y(x_s)) ~ N_s(F beta, sigma_Z^2 R),

where F = F(x_1, ..., x_s) is s x k and R, with R_{i,j} = R(x_i - x_j), is s x s.

Typically we assume R(.) = R(. | xi) is known up to a finite vector of parameters xi, i.e., R(.) is parametric.
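The "urn" picture can be made concrete by sampling finite-dimensional draws of Y = (Y(x_1), ..., Y(x_s)) ~ N_s(beta_0 1, sigma_Z^2 R). A minimal NumPy sketch (not from the slides; the grid, the value of xi, and the jitter term are illustrative choices):

```python
import numpy as np

def gauss_corr(x1, x2, xi):
    """Gaussian correlation R(z) = exp(-xi * z^2) for 1-d inputs."""
    z = x1[:, None] - x2[None, :]
    return np.exp(-xi * z**2)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
R = gauss_corr(x, x, xi=25.0)
beta0, sigma2 = 0.0, 1.0

# Each column of `draws` is one smooth realization y(x) from the process:
# draw ~ N(beta0 * 1, sigma2 * R), via a Cholesky factor of R (+ jitter
# for numerical positive-definiteness).
L = np.linalg.cholesky(R + 1e-8 * np.eye(len(x)))
draws = beta0 + np.sqrt(sigma2) * (L @ rng.standard_normal((len(x), 3)))
print(draws.shape)
```

Choosing a different correlation function (or different xi) changes how smooth and how wiggly the sampled paths are, which is exactly the flexibility the three-urn figure illustrates.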
Popular Correlation Functions

Let z = x_1 - x_2 = (z_1, ..., z_d).

White noise:

    R(z) = 1 if z = 0, and 0 if z != 0, so that R = I_s.

Product power (Gaussian) correlation function:

    R(z) = exp(-sum_{j=1}^d xi_j z_j^2) = prod_{j=1}^d exp(-xi_j z_j^2),  z in R^d.
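The product Gaussian correlation above is straightforward to evaluate for a matrix of d-dimensional inputs. A small sketch (function and input names are illustrative):

```python
import numpy as np

def prod_gauss_corr(X1, X2, xi):
    """Product power (Gaussian) correlation matrix:
    R(z) = exp(-sum_j xi_j z_j^2) for all pairs of rows of X1 and X2."""
    # z has shape (n1, n2, d): all pairwise coordinate differences
    z = X1[:, None, :] - X2[None, :, :]
    return np.exp(-np.einsum('ijk,k->ij', z**2, xi))

X = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 0.0]])
xi = np.array([4.0, 1.0])   # one scale parameter per input dimension
R = prod_gauss_corr(X, X, xi)
print(np.diag(R))  # R(0) = 1 on the diagonal
```

Note that each coordinate gets its own xi_j, so the fitted correlation can be "tight" in important directions and nearly flat in inert ones, which is one reason this family is popular for screening.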
Popular Correlation Functions

Product cubic correlation function:

    R(z | xi) = prod_{j=1}^d R(z_j | xi_j), where xi = (xi_1, ..., xi_d) > 0,

and, for xi > 0 and z in R,

    R(z | xi) = 1 - 6 (z/xi)^2 + 6 (|z|/xi)^3,   |z| <= xi/2
              = 2 (1 - |z|/xi)^3,                xi/2 < |z| <= xi
              = 0,                               xi < |z|.
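Unlike the Gaussian correlation, the cubic correlation is exactly zero beyond the range xi, so the resulting correlation matrices are sparse for spread-out designs. A sketch of the one-dimensional piece (assuming the standard piecewise-cubic form above):

```python
import numpy as np

def cubic_corr_1d(z, xi):
    """One-dimensional cubic correlation with range parameter xi > 0."""
    a = np.abs(np.asarray(z, dtype=float)) / xi
    r = np.zeros_like(a)                      # zero beyond |z| > xi
    near = a <= 0.5
    mid = (a > 0.5) & (a <= 1.0)
    r[near] = 1.0 - 6.0 * a[near]**2 + 6.0 * a[near]**3
    r[mid] = 2.0 * (1.0 - a[mid])**3
    return r

z = np.array([0.0, 0.25, 0.5, 0.75, 1.5])
print(cubic_corr_1d(z, xi=1.0))
```

The two pieces meet at |z| = xi/2 (both give 1/4 when xi = 1), so the function is continuous, and it decreases smoothly to 0 at |z| = xi.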
The Multivariate Normal Distribution

Definition: X = (X_1, ..., X_d) has the multivariate normal distribution with mean mu and covariance matrix Sigma, denoted X ~ N_d(mu, Sigma), if X has joint pdf

    (2 pi)^{-d/2} det(Sigma)^{-1/2} exp{ -(1/2) (x - mu)' Sigma^{-1} (x - mu) }.

Recall that if

    W = (W_1, W_2) ~ N_d( (mu_1, mu_2), [[Sigma_11, Sigma_12], [Sigma_21, Sigma_22]] ),

then the conditional distribution of W_1 given W_2 = w_2, denoted [W_1 | W_2 = w_2], is

    N_{d_1}( mu_1 + Sigma_12 Sigma_22^{-1} (w_2 - mu_2), Sigma_11 - Sigma_12 Sigma_22^{-1} Sigma_21 ).
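The conditioning formula is the engine behind all the predictors that follow. A direct translation (assuming Sigma_21 = Sigma_12', as holds for any covariance matrix; names are illustrative):

```python
import numpy as np

def mvn_conditional(mu1, mu2, S11, S12, S22, w2):
    """Mean and covariance of W1 | W2 = w2 for a partitioned multivariate
    normal with blocks S11, S12, S22 (S21 = S12.T)."""
    K = S12 @ np.linalg.inv(S22)
    return mu1 + K @ (w2 - mu2), S11 - K @ S12.T

# Bivariate example: correlation 0.8, observe W2 = 2.
mu1, mu2 = np.array([0.0]), np.array([0.0])
S11, S12, S22 = np.array([[1.0]]), np.array([[0.8]]), np.array([[1.0]])
m, V = mvn_conditional(mu1, mu2, S11, S12, S22, np.array([2.0]))
print(m, V)  # mean 0.8 * 2 = 1.6, variance 1 - 0.8^2 = 0.36
```

Observing W2 both shifts the mean of W1 toward the data and shrinks its variance, which is exactly how training-data output will sharpen prediction of Y_c(x_0) below.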
Application

Naive predictor:

    y_hat_c(x_0) = E{Y_c(x_0) | Y^tr} = sum_{j=1}^k beta_j f_j(x_0) + r_0' R^{-1} (Y^tr - F beta).

Its uncertainty is

    Var(Y_c(x_0) | Y^tr) = sigma_Z^2 (1 - r_0' R^{-1} r_0).

Usually we don't know (beta, sigma_Z^2, xi) :-(
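With known (beta_0, sigma_Z^2, xi), the predictor and its variance are a few lines of linear algebra. A sketch for a constant-mean model with a 1-d Gaussian correlation (the test function and parameter values are illustrative, not from the slides):

```python
import numpy as np

def krige_predict(x0, Xtr, ytr, beta0, sigma2, xi):
    """Plug-in predictor y_hat(x0) and its variance for a constant-mean GaSP
    with 1-d Gaussian correlation R(z) = exp(-xi * z^2)."""
    R = np.exp(-xi * (Xtr[:, None] - Xtr[None, :])**2)
    r0 = np.exp(-xi * (x0 - Xtr)**2)          # correlations with x0
    Rinv = np.linalg.inv(R)
    yhat = beta0 + r0 @ Rinv @ (ytr - beta0)
    var = sigma2 * (1.0 - r0 @ Rinv @ r0)
    return yhat, var

Xtr = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
ytr = np.sin(2 * np.pi * Xtr)
yhat, var = krige_predict(0.5, Xtr, ytr, beta0=0.0, sigma2=1.0, xi=10.0)
print(yhat, var)
```

Note the deterministic-code behavior: at a training input the predictor reproduces the observed output exactly and the variance collapses to zero, because r_0 is then a row of R.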
Inference

Frequentist inference: the choice of "urn" is equivalent to the selection of (beta_0, sigma_Z^2, xi). Use the training data to estimate (beta_0, sigma_Z^2, xi) (the "urn"), which is the basis for the predictor

    y_hat_c(x_0) = E{Y_c(x_0) | training data, beta_0_hat, sigma_Z^2_hat, xi_hat}.

Bayesian inference places a prior distribution on (beta_0, sigma_Z^2, xi), uses the posterior [beta_0, sigma_Z^2, xi | training data] to select a collection of urns, and averages the predictors from each urn:

    y_hat_c(x_0) = E{ E{Y_c(x_0) | training data, beta_0, sigma_Z^2, xi} },

where the outer E{.} is with respect to [beta_0, sigma_Z^2, xi | training data].
Illustration: true curve (solid); n = 5 training data points (diamonds); REML-EBLUP with Gaussian correlation function (dotted). [Figure: true and predicted y(x) versus x]
- The frequentist plug-in y_hat(x_0) is the basic off-the-shelf predictor of y(x_0) based on training data.
- When (beta_0, sigma_Z^2, xi) are known, y_hat(x_0) has many theoretical optimality properties, although when these parameters are estimated from the data, the list of optimality properties is much shorter.
- How should the data be used to estimate (beta_0, sigma_Z^2, xi)? MLE? REML? PMLE? Cross-validation? No plug-in method overcomes the smoothing problem seen in the example, although Bayesian predictors can.
- Bayesian methodology can be used to make more legitimate predictors and honest estimates of prediction uncertainty.
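As one concrete instance of the estimation question, maximum likelihood for this model is usually computed by profiling: for each candidate xi, beta_0 and sigma_Z^2 have closed-form estimates, leaving a one-dimensional search over xi. A rough sketch (grid search over an illustrative range; the jitter, data, and grid are assumptions, not from the slides):

```python
import numpy as np

def neg_profile_loglik(xi, Xtr, ytr):
    """Negative concentrated log-likelihood for a constant-mean GaSP with
    Gaussian correlation: beta0 and sigma2 are profiled out in closed form."""
    n = len(Xtr)
    R = np.exp(-xi * (Xtr[:, None] - Xtr[None, :])**2) + 1e-8 * np.eye(n)
    Rinv = np.linalg.inv(R)
    one = np.ones(n)
    beta0 = (one @ Rinv @ ytr) / (one @ Rinv @ one)   # GLS estimate of beta0
    res = ytr - beta0
    sigma2 = (res @ Rinv @ res) / n                   # plug-in variance
    _, logdet = np.linalg.slogdet(R)
    return 0.5 * (n * np.log(sigma2) + logdet)

Xtr = np.linspace(0.0, 1.0, 6)
ytr = np.sin(2 * np.pi * Xtr)
grid = np.linspace(1.0, 50.0, 100)
xi_hat = grid[np.argmin([neg_profile_loglik(g, Xtr, ytr) for g in grid])]
print(xi_hat)
```

REML and penalized variants change the objective but keep the same profile-then-search structure; all of them feed the resulting estimates into the plug-in predictor.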
Other limitations: there are many cases where y(.) does not look like a draw from a stationary GaSP; we need predictors based on non-stationary models.

There are two approaches to the design of a computer experiment, i.e., the choice of {x_i^tr}_{i=1}^n:

- Approach 1: geometric — choose the design to be space-filling.
- Approach 2: criteria-based designs for computer experiments, e.g., find an input that achieves argmin_x y_c(x).
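The standard space-filling construction under Approach 1 is the Latin hypercube design: each input dimension is divided into n equal-probability strata and each stratum is used exactly once. A minimal sketch of a random Latin hypercube on [0, 1]^d (production code would typically use a library routine such as scipy.stats.qmc.LatinHypercube):

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """Random Latin hypercube design of n points in [0, 1]^d: each column
    visits each of the n strata [i/n, (i+1)/n) exactly once, with a random
    jitter inside the stratum."""
    u = rng.random((n, d))                                   # within-stratum jitter
    strata = np.array([rng.permutation(n) for _ in range(d)]).T
    return (strata + u) / n

rng = np.random.default_rng(1)
X = latin_hypercube(10, 2, rng)
# Every coordinate projection hits each of the 10 strata exactly once:
print(sorted((X[:, 0] * 10).astype(int).tolist()))
```

This one-point-per-stratum property guarantees good one-dimensional projections, which matters when only a few of the d inputs turn out to be active.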
72 Questions?
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationLinear Models for Regression
Linear Models for Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationBayesian Linear Regression [DRAFT - In Progress]
Bayesian Linear Regression [DRAFT - In Progress] David S. Rosenberg Abstract Here we develop some basics of Bayesian linear regression. Most of the calculations for this document come from the basic theory
More informationModelling Non-linear and Non-stationary Time Series
Modelling Non-linear and Non-stationary Time Series Chapter 2: Non-parametric methods Henrik Madsen Advanced Time Series Analysis September 206 Henrik Madsen (02427 Adv. TS Analysis) Lecture Notes September
More informationTopic 12 Overview of Estimation
Topic 12 Overview of Estimation Classical Statistics 1 / 9 Outline Introduction Parameter Estimation Classical Statistics Densities and Likelihoods 2 / 9 Introduction In the simplest possible terms, the
More informationAssociation studies and regression
Association studies and regression CM226: Machine Learning for Bioinformatics. Fall 2016 Sriram Sankararaman Acknowledgments: Fei Sha, Ameet Talwalkar Association studies and regression 1 / 104 Administration
More informationMachine Learning 4771
Machine Learning 4771 Instructor: Tony Jebara Topic 11 Maximum Likelihood as Bayesian Inference Maximum A Posteriori Bayesian Gaussian Estimation Why Maximum Likelihood? So far, assumed max (log) likelihood
More informationModeling Data with Linear Combinations of Basis Functions. Read Chapter 3 in the text by Bishop
Modeling Data with Linear Combinations of Basis Functions Read Chapter 3 in the text by Bishop A Type of Supervised Learning Problem We want to model data (x 1, t 1 ),..., (x N, t N ), where x i is a vector
More informationEcon 424 Time Series Concepts
Econ 424 Time Series Concepts Eric Zivot January 20 2015 Time Series Processes Stochastic (Random) Process { 1 2 +1 } = { } = sequence of random variables indexed by time Observed time series of length
More informationUncertainty Quantification for Inverse Problems. November 7, 2011
Uncertainty Quantification for Inverse Problems November 7, 2011 Outline UQ and inverse problems Review: least-squares Review: Gaussian Bayesian linear model Parametric reductions for IP Bias, variance
More informationCovariance function estimation in Gaussian process regression
Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian
More informationIntroduction to Gaussian Processes
Introduction to Gaussian Processes Iain Murray murray@cs.toronto.edu CSC255, Introduction to Machine Learning, Fall 28 Dept. Computer Science, University of Toronto The problem Learn scalar function of
More informationIntegrated Likelihood Estimation in Semiparametric Regression Models. Thomas A. Severini Department of Statistics Northwestern University
Integrated Likelihood Estimation in Semiparametric Regression Models Thomas A. Severini Department of Statistics Northwestern University Joint work with Heping He, University of York Introduction Let Y
More informationGAUSSIAN PROCESS REGRESSION
GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The
More informationBootstrap, Jackknife and other resampling methods
Bootstrap, Jackknife and other resampling methods Part III: Parametric Bootstrap Rozenn Dahyot Room 128, Department of Statistics Trinity College Dublin, Ireland dahyot@mee.tcd.ie 2005 R. Dahyot (TCD)
More informationWhy experimenters should not randomize, and what they should do instead
Why experimenters should not randomize, and what they should do instead Maximilian Kasy Department of Economics, Harvard University Maximilian Kasy (Harvard) Experimental design 1 / 42 project STAR Introduction
More informationRegression, Ridge Regression, Lasso
Regression, Ridge Regression, Lasso Fabio G. Cozman - fgcozman@usp.br October 2, 2018 A general definition Regression studies the relationship between a response variable Y and covariates X 1,..., X n.
More informationAnalysing geoadditive regression data: a mixed model approach
Analysing geoadditive regression data: a mixed model approach Institut für Statistik, Ludwig-Maximilians-Universität München Joint work with Ludwig Fahrmeir & Stefan Lang 25.11.2005 Spatio-temporal regression
More informationComputer Vision Group Prof. Daniel Cremers. 2. Regression (cont.)
Prof. Daniel Cremers 2. Regression (cont.) Regression with MLE (Rep.) Assume that y is affected by Gaussian noise : t = f(x, w)+ where Thus, we have p(t x, w, )=N (t; f(x, w), 2 ) 2 Maximum A-Posteriori
More informationIntroduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation. EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016
Introduction to Bayesian Statistics and Markov Chain Monte Carlo Estimation EPSY 905: Multivariate Analysis Spring 2016 Lecture #10: April 6, 2016 EPSY 905: Intro to Bayesian and MCMC Today s Class An
More informationDATA-ADAPTIVE VARIABLE SELECTION FOR
DATA-ADAPTIVE VARIABLE SELECTION FOR CAUSAL INFERENCE Group Health Research Institute Department of Biostatistics, University of Washington shortreed.s@ghc.org joint work with Ashkan Ertefaie Department
More informationSome Concepts of Probability (Review) Volker Tresp Summer 2018
Some Concepts of Probability (Review) Volker Tresp Summer 2018 1 Definition There are different way to define what a probability stands for Mathematically, the most rigorous definition is based on Kolmogorov
More informationUncertainty quantification and calibration of computer models. February 5th, 2014
Uncertainty quantification and calibration of computer models February 5th, 2014 Physical model Physical model is defined by a set of differential equations. Now, they are usually described by computer
More informationBayesian Approaches Data Mining Selected Technique
Bayesian Approaches Data Mining Selected Technique Henry Xiao xiao@cs.queensu.ca School of Computing Queen s University Henry Xiao CISC 873 Data Mining p. 1/17 Probabilistic Bases Review the fundamentals
More informationCOMS 4771 Regression. Nakul Verma
COMS 4771 Regression Nakul Verma Last time Support Vector Machines Maximum Margin formulation Constrained Optimization Lagrange Duality Theory Convex Optimization SVM dual and Interpretation How get the
More informationIntroduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak
Introduction to Systems Analysis and Decision Making Prepared by: Jakub Tomczak 1 Introduction. Random variables During the course we are interested in reasoning about considered phenomenon. In other words,
More informationNonparametric Bayesian Methods (Gaussian Processes)
[70240413 Statistical Machine Learning, Spring, 2015] Nonparametric Bayesian Methods (Gaussian Processes) Jun Zhu dcszj@mail.tsinghua.edu.cn http://bigml.cs.tsinghua.edu.cn/~jun State Key Lab of Intelligent
More informationMultivariate statistical methods and data mining in particle physics
Multivariate statistical methods and data mining in particle physics RHUL Physics www.pp.rhul.ac.uk/~cowan Academic Training Lectures CERN 16 19 June, 2008 1 Outline Statement of the problem Some general
More informationENSC327 Communications Systems 19: Random Processes. Jie Liang School of Engineering Science Simon Fraser University
ENSC327 Communications Systems 19: Random Processes Jie Liang School of Engineering Science Simon Fraser University 1 Outline Random processes Stationary random processes Autocorrelation of random processes
More informationSimulation optimization via bootstrapped Kriging: Survey
Simulation optimization via bootstrapped Kriging: Survey Jack P.C. Kleijnen Department of Information Management / Center for Economic Research (CentER) Tilburg School of Economics & Management (TiSEM)
More informationSequential adaptive designs in computer experiments for response surface model fit
Statistics and Applications Volume 6, Nos. &, 8 (New Series), pp.7-33 Sequential adaptive designs in computer experiments for response surface model fit Chen Quin Lam and William I. Notz Department of
More informationGaussian Processes (10/16/13)
STA561: Probabilistic machine learning Gaussian Processes (10/16/13) Lecturer: Barbara Engelhardt Scribes: Changwei Hu, Di Jin, Mengdi Wang 1 Introduction In supervised learning, we observe some inputs
More informationMIXED EFFECTS MODELS FOR TIME SERIES
Outline MIXED EFFECTS MODELS FOR TIME SERIES Cristina Gorrostieta Hakmook Kang Hernando Ombao Brown University Biostatistics Section February 16, 2011 Outline OUTLINE OF TALK 1 SCIENTIFIC MOTIVATION 2
More informationMachine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function.
Bayesian learning: Machine learning comes from Bayesian decision theory in statistics. There we want to minimize the expected value of the loss function. Let y be the true label and y be the predicted
More informationσ(a) = a N (x; 0, 1 2 ) dx. σ(a) = Φ(a) =
Until now we have always worked with likelihoods and prior distributions that were conjugate to each other, allowing the computation of the posterior distribution to be done in closed form. Unfortunately,
More informationIntroduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones
Introduction to machine learning and pattern recognition Lecture 2 Coryn Bailer-Jones http://www.mpia.de/homes/calj/mlpr_mpia2008.html 1 1 Last week... supervised and unsupervised methods need adaptive
More informationCPSC 540: Machine Learning
CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is
More informationCheng Soon Ong & Christian Walder. Canberra February June 2018
Cheng Soon Ong & Christian Walder Research Group and College of Engineering and Computer Science Canberra February June 2018 Outlines Overview Introduction Linear Algebra Probability Linear Regression
More informationMonitoring Wafer Geometric Quality using Additive Gaussian Process
Monitoring Wafer Geometric Quality using Additive Gaussian Process Linmiao Zhang 1 Kaibo Wang 2 Nan Chen 1 1 Department of Industrial and Systems Engineering, National University of Singapore 2 Department
More informationFte β, λ. N ne. ,λ 1. Z Rte R te,tr R 1
02 4 Bayesian Prediction Methodology from Y te ] ] Fte Y ϑ N ne +n s β, λ F Rte R te, R te, R 4..3 where ϑ depends on the case being studied. A second stage specifies the prior disibution of ϑ ]. The following
More informationLinear Regression and Discrimination
Linear Regression and Discrimination Kernel-based Learning Methods Christian Igel Institut für Neuroinformatik Ruhr-Universität Bochum, Germany http://www.neuroinformatik.rub.de July 16, 2009 Christian
More informationStatistical Distribution Assumptions of General Linear Models
Statistical Distribution Assumptions of General Linear Models Applied Multilevel Models for Cross Sectional Data Lecture 4 ICPSR Summer Workshop University of Colorado Boulder Lecture 4: Statistical Distributions
More informationModels for models. Douglas Nychka Geophysical Statistics Project National Center for Atmospheric Research
Models for models Douglas Nychka Geophysical Statistics Project National Center for Atmospheric Research Outline Statistical models and tools Spatial fields (Wavelets) Climate regimes (Regression and clustering)
More informationmlegp: an R package for Gaussian process modeling and sensitivity analysis
mlegp: an R package for Gaussian process modeling and sensitivity analysis Garrett Dancik January 30, 2018 1 mlegp: an overview Gaussian processes (GPs) are commonly used as surrogate statistical models
More informationPhysical Experimental Design in Support of Computer Model Development. Experimental Design for RSM-Within-Model
DAE 2012 1 Physical Experimental Design in Support of Computer Model Development or Experimental Design for RSM-Within-Model Max D. Morris Department of Statistics Dept. of Industrial & Manufacturing Systems
More informationProbability and Estimation. Alan Moses
Probability and Estimation Alan Moses Random variables and probability A random variable is like a variable in algebra (e.g., y=e x ), but where at least part of the variability is taken to be stochastic.
More informationMachine Learning - MT Linear Regression
Machine Learning - MT 2016 2. Linear Regression Varun Kanade University of Oxford October 12, 2016 Announcements All students eligible to take the course for credit can sign-up for classes and practicals
More information12 - Nonparametric Density Estimation
ST 697 Fall 2017 1/49 12 - Nonparametric Density Estimation ST 697 Fall 2017 University of Alabama Density Review ST 697 Fall 2017 2/49 Continuous Random Variables ST 697 Fall 2017 3/49 1.0 0.8 F(x) 0.6
More informationModel Selection for Gaussian Processes
Institute for Adaptive and Neural Computation School of Informatics,, UK December 26 Outline GP basics Model selection: covariance functions and parameterizations Criteria for model selection Marginal
More informationMachine Learning (CS 567) Lecture 5
Machine Learning (CS 567) Lecture 5 Time: T-Th 5:00pm - 6:20pm Location: GFS 118 Instructor: Sofus A. Macskassy (macskass@usc.edu) Office: SAL 216 Office hours: by appointment Teaching assistant: Cheol
More information