Optimal Designs for Gaussian Process Models via Spectral Decomposition. Ofir Harari
1 Optimal Designs for Gaussian Process Models via Spectral Decomposition. Ofir Harari, Department of Statistics & Actuarial Science, Simon Fraser University. September 2014. Dynamic Computer Experiments.
2 Outline. 1 Introduction: Gaussian Process Regression; Karhunen-Loève Decomposition of a GP. 2 IMSPE-optimal Designs: Application of the K-L Decomposition; Approximate Minimum IMSPE Designs; Adaptive Designs; Extending to Models with Unknown Regression Parameters. 3 Other Optimality Criteria. 4 Concluding Remarks.
4 Introduction. Gaussian Process Regression. The Gaussian Process Model. Suppose that we observe y = [y(x_1), ..., y(x_n)]^T, where y(x) = f(x)^T β + Z(x). Here f(x) is a vector of regression functions evaluated at x, and Z(x) is a zero-mean stationary Gaussian process with marginal variance σ² and correlation function R(·; θ). This is an extremely popular modeling approach in spatial statistics, geostatistics, computer experiments and more. Sometimes measurement error (i.e. a nugget term) is added.
5 The Gaussian Process Model (cont.) Notation: R_ij = R(x_i − x_j), the correlation matrix at the design D = {x_1, ..., x_n}; y = [y(x_1), ..., y(x_n)]^T, the vector of observations at D; r(x) = [R(x − x_1), ..., R(x − x_n)]^T, the vector of correlations at site x; F_ij = f_j(x_i), the design matrix at D.
6 The Gaussian Process Model (cont.) Universal Kriging. Assuming π(β) ∝ 1,

ŷ(x) = E{y(x) | y} = f^T(x) β̂ + r^T(x) R^{-1} (y − F β̂),   (1)

is the minimizer of the mean squared prediction error (MSPE), which is then

E[{ŷ(x) − y(x)}² | y] = var{y(x) | y} = σ² (1 − [f^T(x), r^T(x)] [0, F^T; F, R]^{-1} [f(x); r(x)]).

If the assumption of Gaussianity is dropped, (1) is still the BLUP for y(x).
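The bordered-matrix form of the MSPE lends itself to a direct numerical check. Below is a minimal sketch (not from the slides) of universal kriging with a constant trend and an assumed anisotropic Gaussian correlation; the design, response and parameter values are all illustrative. At a design point the BLUP interpolates the data and the MSPE vanishes.

```python
import numpy as np

rng = np.random.default_rng(0)

def corr(a, b, theta=(1.0, 2.0)):
    # anisotropic Gaussian correlation R(a - b) = exp{-sum_l theta_l (a_l - b_l)^2}
    d = np.asarray(a) - np.asarray(b)
    return float(np.exp(-np.dot(theta, d ** 2)))

def kriging_predict(X, y, x):
    """Universal kriging with constant trend f(x) = 1: returns (BLUP, MSPE/sigma^2)."""
    n = len(X)
    R = np.array([[corr(X[i], X[j]) for j in range(n)] for i in range(n)])
    r = np.array([corr(x, X[i]) for i in range(n)])
    F, f = np.ones((n, 1)), np.ones(1)
    # MSPE/sigma^2 = 1 - [f^T, r^T] [[0, F^T], [F, R]]^-1 [f; r]
    B = np.block([[np.zeros((1, 1)), F.T], [F, R]])
    v = np.concatenate([f, r])
    mspe = 1.0 - v @ np.linalg.solve(B, v)
    # GLS estimate of the trend, then the BLUP of equation (1)
    Rinv_F, Rinv_y = np.linalg.solve(R, F), np.linalg.solve(R, y)
    beta = np.linalg.solve(F.T @ Rinv_F, F.T @ Rinv_y)
    yhat = f @ beta + r @ np.linalg.solve(R, y - F @ beta)
    return float(yhat), float(mspe)

X = rng.uniform(size=(8, 2))                # an arbitrary 8-point design
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2      # a toy response
yhat0, mspe0 = kriging_predict(X, y, X[0])  # at a design point: interpolation
yhat1, mspe1 = kriging_predict(X, y, np.array([0.5, 0.5]))
```

At `X[0]` the predictor reproduces `y[0]` exactly and the scaled MSPE is zero, while at an untried site the MSPE is strictly positive.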
7 Karhunen-Loève Decomposition of a GP. Spectral Decomposition of a GP. Mercer's Theorem. If X is compact and R (the covariance function of a GP) is continuous on X², then

R(x, y) = Σ_{i=1}^∞ λ_i φ_i(x) φ_i(y),

where the φ_i's and the λ_i's are the solutions of the homogeneous Fredholm integral equation of the second kind,

∫_X R(x, y) φ_i(y) dy = λ_i φ_i(x),   with   ∫_X φ_i(x) φ_j(x) dx = δ_ij.
8 Spectral Decomposition of a GP (cont.) The Karhunen-Loève Decomposition. Under the aforementioned conditions,

Z(x) = Σ_{i=1}^∞ α_i φ_i(x)   (in the MSE sense),

where the α_i's are independent N(0, λ_i) random variables. The best approximation (in the MSE sense) of Z(x) is the truncated series in which the eigenvalues are arranged in decreasing order. For piecewise-analytic R(x, y), λ_k ≤ c_1 exp(−c_2 k^{1/d}) for some constants c_1 and c_2 (Frauenfelder et al. 2005). Numerical solutions involve Galerkin-type methods.
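The Fredholm eigenproblem can be approximated numerically; the slides mention Galerkin-type methods, but a simple Nyström/quadrature scheme keeps the sketch short. The snippet below, with an assumed Gaussian kernel on [0, 1] and illustrative grid-size and truncation choices, computes eigenpairs and draws one truncated K-L realization.

```python
import numpy as np

# Nystrom approximation of the Fredholm eigenproblem on X = [0, 1], followed by
# a truncated K-L draw; m, theta and M are illustrative choices
m, theta, M = 200, 10.0, 20
x = (np.arange(m) + 0.5) / m                # midpoint quadrature nodes
w = np.full(m, 1.0 / m)                     # quadrature weights
K = np.exp(-theta * (x[:, None] - x[None, :]) ** 2)   # R(x, y) on the grid

# symmetrized eigenproblem: sqrt(W) K sqrt(W) u = lambda u
sw = np.sqrt(w)
lam, U = np.linalg.eigh(sw[:, None] * K * sw[None, :])
lam, U = lam[::-1], U[:, ::-1]              # eigenvalues in decreasing order
Phi = U / sw[:, None]                       # phi_k values at the nodes

# independent alpha_k ~ N(0, lambda_k) give a truncated realization of Z
rng = np.random.default_rng(1)
alpha = rng.normal(0.0, np.sqrt(np.clip(lam[:M], 0.0, None)))
z = Phi[:, :M] @ alpha
energy = lam[:M].sum() / lam.sum()          # fraction of process variance retained
```

Because R(x, x) = 1, the eigenvalues sum to vol(X) = 1, and for this kernel the first 20 terms retain well over 99% of the process energy, illustrating the rapid eigenvalue decay stated above.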
9 A Worked Example. R(x, w) = exp{−(x_1 − w_1)² − 2(x_2 − w_2)²}. [Figure: eigenvalues vs. index, log scale.]
10 A Worked Example (cont.) [Figure: eigenfunctions #1 through #6 with their eigenvalues λ.]
11 A Worked Example (cont.) A single realization based on the first 17 K-L terms (≈ 99.5% of the process energy). [Figure: surface of the realization over (x_1, x_2).]
13 IMSPE-optimal Designs. Minimum IMSPE Designs. Integrated Mean Squared Prediction Error. Sacks et al. (1989b) suggested that a suitable design for computer experiments should minimize

J(D, ŷ) = (1/σ²) ∫_X E[{ŷ(x) − y(x)}² | y] dx = vol(X) − tr{ [0, F^T; F, R]^{-1} ∫_X [f(x) f^T(x), f(x) r^T(x); r(x) f^T(x), r(x) r^T(x)] dx },

where vol(X) is the volume of X.
14 Application of the K-L Decomposition. Consider the standard Gaussian process y(x) = f(x)^T β + Z(x), where β is (for now) a known vector. The kriging predictor is now ŷ(x) = f(x)^T β + r^T(x) R^{-1} (y − F β). Proposition. For any compact X and continuous covariance kernel R,

J(D, ŷ) = vol(X) − Σ_{k=1}^∞ λ_k² φ_k^T R^{-1} φ_k,   (2)

where φ_p = [φ_p(x_1), ..., φ_p(x_n)]^T.
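The proposition can be checked numerically. The sketch below, with a Nyström approximation of the eigenpairs on an assumed one-dimensional X = [0, 1] and an arbitrary 4-point design, compares direct quadrature of the simple-kriging IMSPE against the eigenvalue series; all numerical choices are illustrative.

```python
import numpy as np

# Numerical check of Proposition (2) on X = [0, 1] (illustrative sketch)
m, theta = 500, 10.0
xg = (np.arange(m) + 0.5) / m
Kg = np.exp(-theta * (xg[:, None] - xg[None, :]) ** 2)
mu, U = np.linalg.eigh(Kg / m)
U = U[:, ::-1]                                  # eigenvectors, decreasing eigenvalues

D = np.array([0.1, 0.35, 0.6, 0.85])            # an arbitrary 4-point design
R = np.exp(-theta * (D[:, None] - D[None, :]) ** 2)
r = np.exp(-theta * (xg[:, None] - D[None, :]) ** 2)    # r(x) on the grid

# left side: vol(X) - int r(x)^T R^-1 r(x) dx by quadrature
lhs = np.mean(1.0 - np.einsum('ij,ij->i', r, np.linalg.solve(R, r.T).T))

# right side: vol(X) - sum_k lam_k^2 phi_k^T R^-1 phi_k, truncated at M = 30;
# column k of V holds lam_k * phi_k(D), avoiding division by tiny eigenvalues
M = 30
V = (r.T / m) @ (np.sqrt(m) * U[:, :M])
rhs = 1.0 - np.trace(V.T @ np.linalg.solve(R, V))
```

The two evaluations agree to numerical precision once M is large enough that the discarded eigenvalues are negligible, which is exactly what motivates the truncated criterion on the next slides.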
15 Application of the K-L Decomposition (cont.) Theorem.
1. For any design D = {x_1, ..., x_n}, J(D, ŷ) ≥ Σ_{k=n+1}^∞ λ_k.   (3)
2. The lower bound (3) is achieved if D satisfies λ_k φ_k^T R^{-1} φ_k = 1 for 1 ≤ k ≤ n, and λ_k φ_k^T R^{-1} φ_k = 0 for k > n.
3. If such an ideal D exists, the prediction ŷ(x) at any x ∈ X is identical to the prediction based on the finite-dimensional Bayesian linear regression model y(x) = α_1 φ_1(x) + ... + α_n φ_n(x), where the α_j ~ N(0, λ_j) are independent.
16 Application of the K-L Decomposition (cont.) By definition, IMSPE-optimal designs are prediction-oriented. What the Theorem tells us is that, in some sense, they are also useful for variable selection (w.r.t. the K-L basis functions). In reality, such ideal designs do not exist - but IMSPE-optimal designs get pretty close. [Figure: λ_k φ_k^T R^{-1} φ_k vs. k, close to 1 for k ≤ n and dropping toward 0 for k > n.]
17 Approximate Minimum IMSPE Designs. Approximate IMSPE:

J_M(D, ŷ) = vol(X) − Σ_{k=1}^M λ_k² φ_k^T R^{-1} φ_k = vol(X) − tr{Λ² Φ^T R^{-1} Φ},

where Λ = diag(λ_1, ..., λ_M) and Φ_ij = φ_j(x_i), 1 ≤ i ≤ n, 1 ≤ j ≤ M. Approximate minimum IMSPE designs:

D* = argmin_{D ⊂ X} J_M(D, ŷ) = argmax_{D ⊂ X} tr{Λ² Φ^T R^{-1} Φ}.
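Evaluating the truncated criterion for a candidate design only needs the first M eigenpairs. A sketch, again on an assumed X = [0, 1] with Nyström eigenpairs (the two candidate designs and all parameters are illustrative): a space-filling design should capture more process energy, i.e. achieve a smaller J_M, than a clustered one.

```python
import numpy as np

# J_M(D) = vol(X) - tr{Lambda^2 Phi^T R^-1 Phi} on X = [0, 1]; column k of V
# below holds lam_k * phi_k(D), so tr{V^T R^-1 V} equals the trace criterion
# while avoiding division by tiny eigenvalues (sketch, illustrative settings)
m, theta, M = 400, 10.0, 15
xg = (np.arange(m) + 0.5) / m
Kg = np.exp(-theta * (xg[:, None] - xg[None, :]) ** 2)
_, U = np.linalg.eigh(Kg / m)
U = np.sqrt(m) * U[:, ::-1][:, :M]              # phi_k on the grid, k = 1..M

def J_M(D):
    D = np.asarray(D, dtype=float)
    R = np.exp(-theta * (D[:, None] - D[None, :]) ** 2)
    Kd = np.exp(-theta * (D[:, None] - xg[None, :]) ** 2)
    V = (Kd / m) @ U                            # Nystrom: V[:, k] = lam_k phi_k(D)
    return 1.0 - np.trace(V.T @ np.linalg.solve(R, V))

spread = [0.1, 0.3, 0.5, 0.7, 0.9]              # space-filling candidate
packed = [0.40, 0.45, 0.50, 0.55, 0.60]         # clustered candidate
j_spread, j_packed = J_M(spread), J_M(packed)
```

In an actual design search, `J_M` would be handed to a numerical optimizer over the n design points rather than compared over a fixed pair of candidates.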
18 A Worked Example (cont.) [Figure: size 7, 15 and 25 approximate minimum IMSPE designs for R(x, w) = exp{−(x_1 − w_1)² − 2(x_2 − w_2)²}.]
19 Controlling the Relative Truncation Error. Proposition. Under some conditions (which are easily met),

r_M = (Σ_{k=M+1}^∞ λ_k² φ_k^T R^{-1} φ_k) / (Σ_{j=1}^∞ λ_j² φ_j^T R^{-1} φ_j) ≤ (Σ_{k=M+1}^∞ λ_k) / (Σ_{j=1}^∞ λ_j).

By conserving enough of the process energy, we are guaranteed to have a design as close to optimal as we wish.
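Since the right-hand side depends on the eigenvalues alone, the truncation level M can be fixed before any design is computed. A short sketch (grid size, θ and the tolerance ε are illustrative choices) picking the smallest M whose eigenvalue bound falls below a target relative error:

```python
import numpy as np

# Choosing the truncation level M from the eigenvalue bound on r_M (sketch):
# the bound is the fraction of process energy left out of the first M terms
m, theta = 400, 10.0
xg = (np.arange(m) + 0.5) / m
Kg = np.exp(-theta * (xg[:, None] - xg[None, :]) ** 2)
lam = np.sort(np.linalg.eigvalsh(Kg / m))[::-1]

eps = 0.005                                     # tolerated relative error
bound = 1.0 - np.cumsum(lam) / lam.sum()        # bound on r_M for M = 1, 2, ...
M_star = int(np.argmax(bound <= eps)) + 1       # smallest M with bound <= eps
```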
20 Controlling the Relative Truncation Error (cont.) [Figure: relative truncation error vs. sample size for M = 17 (≈ 0.492% energy loss), ideal vs. true.]
21 Adaptive Designs. Suppose now that we have the opportunity (or are forced) to run a multi-stage experiment: n_1 runs at stage 1, n_2 runs at stage 2, and so on. We can use this to learn/re-estimate the essential parameters and plug in the new estimates, hopefully improving the design from one stage to the next. Such designs are called adaptive or batch-sequential designs.
22 Adaptive Designs (cont.) Denote

R_aug = [R_new, R_cross^T; R_cross, R_old],

where R_new (n_2 × n_2) is the matrix of correlations among the new inputs, R_old (n_1 × n_1) the matrix of correlations among the original design, R_cross (n_1 × n_2) the cross-correlation matrix between the old and new inputs, and R_aug the augmented correlation matrix; and

Φ_aug = [Φ_new (n_2 × M); Φ_old (n_1 × M)].

Parameters may be re-estimated in between batches.
23 Adaptive Designs (cont.) Using basic block-matrix properties, we look to maximize

tr{Λ² Φ_aug^T R_aug^{-1} Φ_aug} = tr[Λ² {(Φ_new^T − Φ_old^T R_old^{-1} R_cross) Q_1^{-1} Φ_new + (Φ_old^T − Φ_new^T R_new^{-1} R_cross^T) Q_2^{-1} Φ_old}],

where Q_1 = R_new − R_cross^T R_old^{-1} R_cross and Q_2 = R_old − R_cross R_new^{-1} R_cross^T, and R_old and Φ_old remain unchanged.
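The block form can be verified numerically against direct evaluation on the augmented matrices. In the sketch below the basis values Phi and the diagonal weights standing in for Λ² are arbitrary stand-ins (not the true K-L eigenpairs), and the old/new input locations are illustrative; the point is only the algebraic identity via the two Schur complements.

```python
import numpy as np

# Numerical check of the block-matrix identity above (illustrative stand-ins)
theta, M = 30.0, 6

def corrmat(A, B):
    # Gaussian correlations between two 1-D point sets
    return np.exp(-theta * (A[:, None] - B[None, :]) ** 2)

x_old = np.array([0.0, 0.15, 0.3, 0.45, 0.6, 0.75, 0.9])   # n1 = 7 original runs
x_new = np.array([0.08, 0.38, 0.68, 0.97])                  # n2 = 4 new runs
R_old, R_new = corrmat(x_old, x_old), corrmat(x_new, x_new)
R_cross = corrmat(x_old, x_new)                             # n1 x n2

Phi_old = np.cos(np.outer(x_old, np.arange(1, M + 1)))      # stand-in basis values
Phi_new = np.cos(np.outer(x_new, np.arange(1, M + 1)))
L2 = np.diag(1.0 / np.arange(1, M + 1) ** 2)                # plays the role of Lambda^2

# direct evaluation on the augmented matrices
R_aug = np.block([[R_new, R_cross.T], [R_cross, R_old]])
Phi_aug = np.vstack([Phi_new, Phi_old])
direct = np.trace(L2 @ Phi_aug.T @ np.linalg.solve(R_aug, Phi_aug))

# block form via the Schur complements Q1 and Q2
Q1 = R_new - R_cross.T @ np.linalg.solve(R_old, R_cross)
Q2 = R_old - R_cross @ np.linalg.solve(R_new, R_cross.T)
T1 = (Phi_new.T - Phi_old.T @ np.linalg.solve(R_old, R_cross)) @ np.linalg.solve(Q1, Phi_new)
T2 = (Phi_old.T - Phi_new.T @ np.linalg.solve(R_new, R_cross.T)) @ np.linalg.solve(Q2, Phi_old)
blockwise = np.trace(L2 @ (T1 + T2))
```

Only the `Q1`-solve and the term in `Phi_new` change as the new batch moves, which is what makes the batch-sequential optimization cheap.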
24 Adaptive Designs (cont.) Adding 7 new runs to an existing 8-run design (λ = 0.1). [Figure: size 8+7 design with its IMSPE; 1st batch vs. 2nd batch.]
26 Extending to Models with Unknown Regression Parameters. Approximate IMSPE Criterion for the Universal Kriging Model. Expanding the covariance kernel using Mercer's theorem and truncating after the first M terms yields

H_M(D, ŷ) = J_M(D, ŷ) + tr{ (F^T R^{-1} F)^{-1} [ ∫_X f(x) f^T(x) dx − 2 A^T ∫_X φ(x) f^T(x) dx + A^T A ] },   (4)

where J_M is the criterion previously derived for the simple kriging model, A = Λ Φ^T R^{-1} F and φ(x) = [φ_1(x), ..., φ_M(x)]^T.
27 Approximate IMSPE Criterion for the Universal Kriging Model (cont.) Expression (4) simplifies somewhat by substituting f(x) = 1 and F = 1_n, to

H_M(D, ŷ) = J_M(D, ŷ) + (1_n^T R^{-1} 1_n)^{-1} { vol(X) − 2 a^T γ + a^T a },

where a = Λ Φ^T R^{-1} 1_n and γ = [∫_X φ_1(x) dx, ..., ∫_X φ_M(x) dx]^T. The vector of integrals γ only needs to be evaluated once. The resulting designs are almost identical to those obtained for the simple kriging model, for reasonably large n.
28 With and Without Estimating the Intercept. [Figure: designs obtained with and without estimation of the intercept.]
29 With and Without Estimating the Intercept (cont.) [Figure: deviation from the optimum vs. sample size n.]
31 Other Optimality Criteria. Bayesian A-optimal Designs. In classical DOE, a design is called A-optimal if it minimizes tr{var(θ̂)}, where θ̂ is the vector of estimators of the model parameters. For our model

y(x) = f(x)^T β + Z(x) = f(x)^T β + Σ_{k=1}^∞ α_k φ_k(x),   α_k ~ N(0, λ_k),

we may consider the Bayesian A-optimality criterion Q(D, ŷ) = Σ_{k=1}^∞ var{α_k | y}. When β is a known vector, the IMSPE and the A-optimality criteria coincide.
32 Bayesian A-optimal Designs (cont.) Proposition. If π(β) ∝ 1,

Q(D, ŷ) = vol(X) − tr{ [0, F^T; F, R]^{-1} [0, 0; 0, ∫_X r(x) r^T(x) dx] }

(and hence we don't even need to derive the Mercer expansion). [Figure: IMSPE-optimal vs. A-optimal designs.]
33 Bayesian D-optimal Designs. In classical DOE, a design is called D-optimal if it minimizes the determinant of var(θ̂) (typically the inverse Fisher information). In our framework, an approximate Bayesian D-optimal design is

D* = argmin_D det(var{α_1, ..., α_M | y}),

where

var{α_1, ..., α_M | y} = Λ − Λ Φ^T [R^{-1} − R^{-1} F (F^T R^{-1} F)^{-1} F^T R^{-1}] Φ Λ.

[Figure: D-optimal, IMSPE-optimal and A-optimal designs.]
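Both Bayesian criteria can be read off the same posterior covariance matrix of the truncated K-L coefficients: its trace gives the (approximate) A-criterion and its determinant the D-criterion. A sketch, with Nyström eigenpairs on an assumed X = [0, 1], a flat prior on an intercept-only trend, and an illustrative 5-point design:

```python
import numpy as np

# Posterior covariance of the first M K-L coefficients under a flat prior on
# the intercept; grid size, theta, M and the design are illustrative choices
m, theta, M = 300, 10.0, 10
xg = (np.arange(m) + 0.5) / m
Kg = np.exp(-theta * (xg[:, None] - xg[None, :]) ** 2)
lam, U = np.linalg.eigh(Kg / m)
lam, U = lam[::-1][:M], np.sqrt(m) * U[:, ::-1][:, :M]

D = np.array([0.05, 0.25, 0.5, 0.75, 0.95])
Kd = np.exp(-theta * (D[:, None] - xg[None, :]) ** 2)
Phi = (Kd / m) @ U / lam                        # Nystrom extension of phi_k to D
R = np.exp(-theta * (D[:, None] - D[None, :]) ** 2)
F = np.ones((len(D), 1))

# var{alpha | y} = Lambda - Lambda Phi^T [R^-1 - R^-1 F (F^T R^-1 F)^-1 F^T R^-1] Phi Lambda
Rinv_Phi, Rinv_F = np.linalg.solve(R, Phi), np.linalg.solve(R, F)
G_Phi = Rinv_Phi - Rinv_F @ np.linalg.solve(F.T @ Rinv_F, F.T @ Rinv_Phi)
Lam = np.diag(lam)
post = Lam - Lam @ Phi.T @ G_Phi @ Lam          # var{alpha_1, ..., alpha_M | y}

a_crit = np.trace(post)                         # Bayesian A-optimality value
d_crit = np.linalg.det(post)                    # Bayesian D-optimality value
```

Observing the design strictly reduces the total prior uncertainty, so the trace of `post` falls below the prior trace Σλ_k, while the matrix itself stays positive definite.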
35 Concluding Remarks. A few comments before we go. If Z(x) is a zero-mean, non-Gaussian, weakly stationary process with the same correlation structure, the kriging predictor is no longer the posterior mean, but is still the BLUP; the MSPE is no longer the posterior variance, but all the results still hold. Inclusion of measurement error (i.e. a nugget term),

y(x) = f(x)^T β + Z(x) + ε(x),   ε ~ N(0, τ² I),   λ = τ²/σ²,

simply results in replacing R with R + λI everywhere, although the theory will cease to apply. Great computational benefits for small values of λ.
36 Thank you!
37 References I
1. Andrianakis, Ioannis, & Challenor, Peter G. The effect of the nugget on Gaussian process emulators of computer models. Computational Statistics and Data Analysis, 56(12).
2. Atkinson, A. C., Donev, A. N., & Tobias, R. D. Optimum Experimental Designs, with SAS. Oxford Statistical Science Series.
3. Bursztyn, D., & Steinberg, D. M. Data analytic tools for understanding random field regression models. Technometrics, 46(4).
4. Cressie, Noel A. C. Statistics for Spatial Data. Wiley-Interscience.
5. Frauenfelder, Philipp, Schwab, Christoph, & Todor, Radu Alexandru. 2005. Finite Elements for Elliptic Problems with Stochastic Coefficients. Computer Methods in Applied Mechanics and Engineering, 194(2).
6. Guenther, Ronald B., & Lee, John W. Partial Differential Equations of Mathematical Physics and Integral Equations. Courier Dover Publications.
7. Jones, Brad, & Johnson, Rachel T. The Design and Analysis of the Gaussian Process Model. Quality and Reliability Engineering International, 25(5).
8. Journel, A., & Rossi, M. When Do We Need a Trend Model in Kriging? Mathematical Geology, 21(7).
38 References II
9. Linkletter, Crystal, Bingham, Derek, Hengartner, Nicholas, Higdon, David, & Ye, Kenny Q. Variable Selection for Gaussian Process Models in Computer Experiments. Technometrics, 48(4).
10. Loeppky, Jason L., Sacks, Jerome, & Welch, William J. Choosing the Sample Size of a Computer Experiment: A Practical Guide. Technometrics, 51(4).
11. Loeppky, Jason L., Moore, Leslie M., & Williams, Brian J. Batch Sequential Designs for Computer Experiments. Journal of Statistical Planning and Inference, 140(6).
12. Picheny, Victor, Ginsbourger, David, Roustant, Olivier, Haftka, Raphael T., & Kim, Nam H. Adaptive Designs of Experiments for Accurate Approximation of a Target Region. Journal of Mechanical Design, 132(7).
13. Rasmussen, Carl Edward, & Williams, Christopher K. I. Gaussian Processes for Machine Learning. The MIT Press.
14. Sacks, Jerome, Welch, William J., Mitchell, Toby J., & Wynn, Henry P. 1989a. Design and Analysis of Computer Experiments. Statistical Science, 4(4).
15. Sacks, Jerome, Schiller, Susannah B., & Welch, William J. 1989b. Designs for Computer Experiments. Technometrics, 31(1).
39 References III
16. Santner, Thomas J., Williams, Brian J., & Notz, William I. The Design and Analysis of Computer Experiments. Springer.
17. Shewry, Michael C., & Wynn, Henry P. Maximum Entropy Sampling. Journal of Applied Statistics, 14(2).
18. Singhee, Amith, Singhal, Sonia, & Rutenbar, Rob A. Exploiting Correlation Kernels for Efficient Handling of Intra-Die Spatial Correlation, with Application to Statistical Timing. In: Proceedings of the Conference on Design, Automation and Test in Europe.
19. Wahba, Grace. Spline Models for Observational Data. Society for Industrial and Applied Mathematics.
More informationFollow-Up Experimental Designs for Computer Models and Physical Processes
This article was downloaded by: [Acadia University] On: 13 June 2012, At: 07:19 Publisher: Taylor & Francis Informa Ltd Registered in England and Wales Registered Number: 1072954 Registered office: Mortimer
More informationNew Developments in LS-OPT - Robustness Studies
5. LS-DYNA - Anwenderforum, Ulm 2006 Robustheit / Optimierung I New Developments in LS-OPT - Robustness Studies Ken Craig, Willem Roux 2, Nielen Stander 2 Multidisciplinary Design Optimization Group University
More informationCross Validation and Maximum Likelihood estimations of hyper-parameters of Gaussian processes with model misspecification
Cross Validation and Maximum Likelihood estimations of hyper-parameters of Gaussian processes with model misspecification François Bachoc Josselin Garnier Jean-Marc Martinez CEA-Saclay, DEN, DM2S, STMF,
More informationA Magiv CV Theory for Large-Margin Classifiers
A Magiv CV Theory for Large-Margin Classifiers Hui Zou School of Statistics, University of Minnesota June 30, 2018 Joint work with Boxiang Wang Outline 1 Background 2 Magic CV formula 3 Magic support vector
More informationMustafa H. Tongarlak Bruce E. Ankenman Barry L. Nelson
Proceedings of the 0 Winter Simulation Conference S. Jain, R. R. Creasey, J. Himmelspach, K. P. White, and M. Fu, eds. RELATIVE ERROR STOCHASTIC KRIGING Mustafa H. Tongarlak Bruce E. Ankenman Barry L.
More informationIntroduction to Gaussian Processes
Introduction to Gaussian Processes 1 Objectives to express prior knowledge/beliefs about model outputs using Gaussian process (GP) to sample functions from the probability measure defined by GP to build
More informationLinear Models for Regression
Linear Models for Regression Seungjin Choi Department of Computer Science and Engineering Pohang University of Science and Technology 77 Cheongam-ro, Nam-gu, Pohang 37673, Korea seungjin@postech.ac.kr
More informationChapter 4 - Fundamentals of spatial processes Lecture notes
TK4150 - Intro 1 Chapter 4 - Fundamentals of spatial processes Lecture notes Odd Kolbjørnsen and Geir Storvik January 30, 2017 STK4150 - Intro 2 Spatial processes Typically correlation between nearby sites
More informationCS 195-5: Machine Learning Problem Set 1
CS 95-5: Machine Learning Problem Set Douglas Lanman dlanman@brown.edu 7 September Regression Problem Show that the prediction errors y f(x; ŵ) are necessarily uncorrelated with any linear function of
More informationSTA414/2104. Lecture 11: Gaussian Processes. Department of Statistics
STA414/2104 Lecture 11: Gaussian Processes Department of Statistics www.utstat.utoronto.ca Delivered by Mark Ebden with thanks to Russ Salakhutdinov Outline Gaussian Processes Exam review Course evaluations
More informationKarhunen-Loève decomposition of Gaussian measures on Banach spaces
Karhunen-Loève decomposition of Gaussian measures on Banach spaces Jean-Charles Croix GT APSSE - April 2017, the 13th joint work with Xavier Bay. 1 / 29 Sommaire 1 Preliminaries on Gaussian processes 2
More informationDesign of Screening Experiments with Partial Replication
Design of Screening Experiments with Partial Replication David J. Edwards Department of Statistical Sciences & Operations Research Virginia Commonwealth University Robert D. Leonard Department of Information
More informationThe Expectation Maximization or EM algorithm
The Expectation Maximization or EM algorithm Carl Edward Rasmussen November 15th, 2017 Carl Edward Rasmussen The EM algorithm November 15th, 2017 1 / 11 Contents notation, objective the lower bound functional,
More informationGaussian Process Regression: Active Data Selection and Test Point. Rejection. Sambu Seo Marko Wallat Thore Graepel Klaus Obermayer
Gaussian Process Regression: Active Data Selection and Test Point Rejection Sambu Seo Marko Wallat Thore Graepel Klaus Obermayer Department of Computer Science, Technical University of Berlin Franklinstr.8,
More informationVariational Model Selection for Sparse Gaussian Process Regression
Variational Model Selection for Sparse Gaussian Process Regression Michalis K. Titsias School of Computer Science University of Manchester 7 September 2008 Outline Gaussian process regression and sparse
More informationInversion Base Height. Daggot Pressure Gradient Visibility (miles)
Stanford University June 2, 1998 Bayesian Backtting: 1 Bayesian Backtting Trevor Hastie Stanford University Rob Tibshirani University of Toronto Email: trevor@stat.stanford.edu Ftp: stat.stanford.edu:
More informationGAUSSIAN PROCESS REGRESSION
GAUSSIAN PROCESS REGRESSION CSE 515T Spring 2015 1. BACKGROUND The kernel trick again... The Kernel Trick Consider again the linear regression model: y(x) = φ(x) w + ε, with prior p(w) = N (w; 0, Σ). The
More informationNonparmeteric Bayes & Gaussian Processes. Baback Moghaddam Machine Learning Group
Nonparmeteric Bayes & Gaussian Processes Baback Moghaddam baback@jpl.nasa.gov Machine Learning Group Outline Bayesian Inference Hierarchical Models Model Selection Parametric vs. Nonparametric Gaussian
More informationMonte Carlo Methods for Uncertainty Quantification
Monte Carlo Methods for Uncertainty Quantification Mike Giles Mathematical Institute, University of Oxford Contemporary Numerical Techniques Mike Giles (Oxford) Monte Carlo methods 1 / 23 Lecture outline
More informationDesign of Computer Experiments for Optimization, Estimation of Function Contours, and Related Objectives
7 Design of Computer Experiments for Optimization, Estimation of Function Contours, and Related Objectives Derek Bingham Simon Fraser University, Burnaby, BC Pritam Ranjan Acadia University, Wolfville,
More informationModelling geoadditive survival data
Modelling geoadditive survival data Thomas Kneib & Ludwig Fahrmeir Department of Statistics, Ludwig-Maximilians-University Munich 1. Leukemia survival data 2. Structured hazard regression 3. Mixed model
More informationGaussian Processes. 1 What problems can be solved by Gaussian Processes?
Statistical Techniques in Robotics (16-831, F1) Lecture#19 (Wednesday November 16) Gaussian Processes Lecturer: Drew Bagnell Scribe:Yamuna Krishnamurthy 1 1 What problems can be solved by Gaussian Processes?
More informationStochastic Spectral Approaches to Bayesian Inference
Stochastic Spectral Approaches to Bayesian Inference Prof. Nathan L. Gibson Department of Mathematics Applied Mathematics and Computation Seminar March 4, 2011 Prof. Gibson (OSU) Spectral Approaches to
More informationIntroduction to Gaussian Processes
Introduction to Gaussian Processes Neil D. Lawrence GPSS 10th June 2013 Book Rasmussen and Williams (2006) Outline The Gaussian Density Covariance from Basis Functions Basis Function Representations Constructing
More informationLinear Regression (9/11/13)
STA561: Probabilistic machine learning Linear Regression (9/11/13) Lecturer: Barbara Engelhardt Scribes: Zachary Abzug, Mike Gloudemans, Zhuosheng Gu, Zhao Song 1 Why use linear regression? Figure 1: Scatter
More informationGaussian Process Regression and Emulation
Gaussian Process Regression and Emulation STAT8810, Fall 2017 M.T. Pratola September 22, 2017 Today Experimental Design; Sensitivity Analysis Designing Your Experiment If you will run a simulator model,
More informationExam: high-dimensional data analysis January 20, 2014
Exam: high-dimensional data analysis January 20, 204 Instructions: - Write clearly. Scribbles will not be deciphered. - Answer each main question not the subquestions on a separate piece of paper. - Finish
More informationSEQUENTIAL DESIGN OF COMPUTER EXPERIMENTS TO MINIMIZE INTEGRATED RESPONSE FUNCTIONS
Statistica Sinica 10(2000), 1133-1152 SEQUENTIAL DESIGN OF COMPUTER EXPERIMENTS TO MINIMIZE INTEGRATED RESPONSE FUNCTIONS Brian J. Williams, Thomas J. Santner and William I. Notz The Ohio State University
More informationKriging models with Gaussian processes - covariance function estimation and impact of spatial sampling
Kriging models with Gaussian processes - covariance function estimation and impact of spatial sampling François Bachoc former PhD advisor: Josselin Garnier former CEA advisor: Jean-Marc Martinez Department
More informationBayesian Dynamic Linear Modelling for. Complex Computer Models
Bayesian Dynamic Linear Modelling for Complex Computer Models Fei Liu, Liang Zhang, Mike West Abstract Computer models may have functional outputs. With no loss of generality, we assume that a single computer
More informationBayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes
Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes Andrew O. Finley 1 and Sudipto Banerjee 2 1 Department of Forestry & Department of Geography, Michigan
More informationSEQUENTIAL ADAPTIVE DESIGNS IN COMPUTER EXPERIMENTS FOR RESPONSE SURFACE MODEL FIT
SEQUENTIAL ADAPTIVE DESIGNS IN COMPUTER EXPERIMENTS FOR RESPONSE SURFACE MODEL FIT DISSERTATION Presented in Partial Fulfillment of the Requirements for the Degree Doctor of Philosophy in the Graduate
More informationPhysics 403. Segev BenZvi. Parameter Estimation, Correlations, and Error Bars. Department of Physics and Astronomy University of Rochester
Physics 403 Parameter Estimation, Correlations, and Error Bars Segev BenZvi Department of Physics and Astronomy University of Rochester Table of Contents 1 Review of Last Class Best Estimates and Reliability
More informationProbabilistic and Bayesian Machine Learning
Probabilistic and Bayesian Machine Learning Day 4: Expectation and Belief Propagation Yee Whye Teh ywteh@gatsby.ucl.ac.uk Gatsby Computational Neuroscience Unit University College London http://www.gatsby.ucl.ac.uk/
More informationLecture 5: Linear models for classification. Logistic regression. Gradient Descent. Second-order methods.
Lecture 5: Linear models for classification. Logistic regression. Gradient Descent. Second-order methods. Linear models for classification Logistic regression Gradient descent and second-order methods
More informationMultivariate Bayesian Linear Regression MLAI Lecture 11
Multivariate Bayesian Linear Regression MLAI Lecture 11 Neil D. Lawrence Department of Computer Science Sheffield University 21st October 2012 Outline Univariate Bayesian Linear Regression Multivariate
More informationTutorial on Gaussian Processes and the Gaussian Process Latent Variable Model
Tutorial on Gaussian Processes and the Gaussian Process Latent Variable Model (& discussion on the GPLVM tech. report by Prof. N. Lawrence, 06) Andreas Damianou Department of Neuro- and Computer Science,
More information