Étude de classes de noyaux adaptées à la simplification et à l'interprétation des modèles d'approximation (Study of kernel classes suited to the simplification and interpretation of approximation models).
1 Étude de classes de noyaux adaptées à la simplification et à l'interprétation des modèles d'approximation. PhD defense of N. Durrande, École des Mines de Saint-Étienne, 9 November 2011. Advisors: Laurent Carraro, Rodolphe Le Riche. Co-advisors: David Ginsbourger, Olivier Roustant. Reviewers: Béatrice Laurent, Henry Wynn. Examiners: Yves Grandvalet, Alberto Pasanisi.
2 Introduction to Gaussian Process models. Outline:
1 Introduction: the Gaussian process modeling framework; issues when d is large (lack of interpretability, large number of observations required)
2 Simplified models for high-dimensional modeling
3 Gaussian processes and the ANOVA representation
3 Introduction to Gaussian Process models. Let $f : D \rightarrow \mathbb{R}$ be a function whose value is known on a limited set of points $X = (X_1, \dots, X_n)$. This situation arises in many fields: engineering, geostatistics, numerical simulation. [Figure: f and its observations.]
4 Introduction to Gaussian Process models. Given a kernel $K$ and its associated centered Gaussian process $Z$, one can look at the sample paths satisfying $Z_\omega(X_i) = f(X_i)$. We can compute the conditional distribution of $Z$ given the observations. [Figure: conditional sample paths.]
5 Introduction to Gaussian Process models. We approximate $f(x)$ using the conditional expectation $m(x)$ and the conditional variance $v(x)$ of $Z$:
$$m(x) = k(x)^T \mathbf{K}^{-1} F$$
$$v(x) = K(x, x) - k(x)^T \mathbf{K}^{-1} k(x)$$
where $k$ is the functional vector $(k(\cdot))_i = K(X_i, \cdot)$, $\mathbf{K}$ is the covariance matrix $(\mathbf{K})_{ij} = K(X_i, X_j)$, and $F$ is the vector of observations $(f(X_1), \dots, f(X_n))$.
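As a minimal numerical sketch of these two formulas (assuming a squared exponential kernel; the helper names and the toy design are illustrative, not from the talk):

```python
import numpy as np

def se_kernel(x, y, sigma2=1.0, theta=0.2):
    """Squared exponential kernel K(x, y) for scalar or array inputs."""
    return sigma2 * np.exp(-(x - y) ** 2 / (2 * theta ** 2))

X = np.array([0.1, 0.4, 0.6, 0.9])     # design points X_1, ..., X_n
F = np.sin(2 * np.pi * X)              # observations f(X_i)
K = se_kernel(X[:, None], X[None, :])  # covariance matrix (K)_ij = K(X_i, X_j)

def predict(x):
    k = se_kernel(X, x)                              # (k(x))_i = K(X_i, x)
    m = k @ np.linalg.solve(K, F)                    # m(x) = k(x)^T K^{-1} F
    v = se_kernel(x, x) - k @ np.linalg.solve(K, k)  # v(x) = K(x,x) - k^T K^{-1} k
    return m, v

print(predict(0.5))   # conditional mean and variance at x = 0.5
```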
6 Introduction to Gaussian Process models. The best predictor $m(x)$ can be seen as a linear combination of the basis functions $K(X_i, \cdot)$. [Figure: the predictor and its basis functions.]
7 Introduction to Gaussian Process models. The choice of the covariance kernel $K$ has a great impact on the model. [Figure: models obtained with a squared exponential kernel with $(\sigma^2, \theta) = (1, 0.2)$, a Brownian kernel, and a squared exponential kernel with $(\sigma^2, \theta) = (1, 0.5)$.] How can we choose the most appropriate kernel?
8 Introduction to Gaussian Process models. Definition: a kernel is a symmetric function of positive type:
$$K(x, y) = K(y, x)$$
$$\forall n \in \mathbb{N},\ \forall a_1, \dots, a_n \in \mathbb{R},\ \forall x_1, \dots, x_n \in D, \quad \sum_{i=1}^n \sum_{j=1}^n a_i a_j K(x_i, x_j) \geq 0.$$
Verifying the second condition directly is often intractable.
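Although the condition cannot be checked exactly in general, a necessary condition can be tested numerically: the Gram matrix on any finite sample of points must be positive semi-definite. A small sketch of such a check (the helper is illustrative, not from the talk):

```python
import numpy as np

def is_psd_on_sample(kernel, points, tol=1e-10):
    """Necessary condition for a valid kernel: the Gram matrix on any
    finite set of points must have no eigenvalue below -tol."""
    G = np.array([[kernel(x, y) for y in points] for x in points])
    return np.all(np.linalg.eigvalsh(G) >= -tol)

rng = np.random.default_rng(0)
pts = rng.uniform(0, 1, size=30)
print(is_psd_on_sample(lambda x, y: np.exp(-(x - y) ** 2), pts))  # True
print(is_psd_on_sample(lambda x, y: -abs(x - y), pts))  # False: symmetric but not of positive type
```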
9 Introduction to Gaussian Process models. However, if $K_1$, $K_2$ are kernels and $f$ is a real-valued function, then $K_1 K_2$, $K_1 + K_2$ and $f(x) K_1(x, y) f(y)$ are covariance kernels. It's easy to create new kernels from old! Example: construction of multidimensional kernels, $K(x, y) = \prod_i K_i(x_i, y_i)$.
10 Limitations for tensor product kernels. Most of the time, multidimensional kernels are based on the product of univariate kernels. For example, the squared exponential kernel over $[0, 1]^2$ is:
$$K(x, y) = \exp(-\|x - y\|^2) = \exp(-(x_1 - y_1)^2 - (x_2 - y_2)^2) = K_u(x_1, y_1) \times K_u(x_2, y_2).$$
With such kernels, it happens that an observation $f(X_i)$ only has an influence on the neighborhood of $X_i$.
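A short sketch of this factorization, assuming squared exponential components (names illustrative):

```python
import numpy as np

def k_u(s, t):
    """Univariate squared exponential kernel."""
    return np.exp(-(s - t) ** 2)

def k_prod(x, y):
    """Tensor product kernel: product of univariate kernels over the dimensions."""
    return np.prod([k_u(xi, yi) for xi, yi in zip(x, y)])

x, y = np.array([0.2, 0.7]), np.array([0.3, 0.5])
# The product of univariate kernels equals the multivariate squared exponential:
assert np.isclose(k_prod(x, y), np.exp(-np.sum((x - y) ** 2)))
print(k_prod(x, y))
```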
11 Limitations for tensor product kernels. For such kernels, the basis function $K(X_i, \cdot)$ associated with an observation at $X_i$ only has a local influence. [Figure: basis function centered at $X_1 = (0.25, 0.25)$.] Conversely to other methods where the basis functions have a global meaning, such kriging models are difficult to interpret.
12 Limitations for tensor product kernels. If the neighborhood of an observation is a domain of size $\varepsilon$ for $d = 1$, the fraction of the input space it covers shrinks as the dimension increases. [Figure: neighborhoods of size $\varepsilon$ for $d = 1, 2, 3$.] This phenomenon is known as the curse of dimensionality: the number of observations has to grow exponentially with $d$.
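A back-of-envelope illustration of this exponential growth (a rough sketch, not a computation from the talk):

```python
# If each observation "covers" a hypercube of side eps, filling [0,1]^d
# requires roughly (1/eps)^d observations:
eps = 0.1
for d in (1, 2, 3, 6):
    print(d, int(round((1 / eps) ** d)))   # 10, 100, 1000, 1000000
```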
13 Limitations for tensor product kernels. When using the usual kernels in high dimension, kriging emulators face two issues: they require a large number of observations, and they are difficult to interpret. The aim of this presentation is to get around those issues. Outline:
1 Additive models
2 Simplified models with interaction terms
3 Interpretation of high-dimensional models
14 Additive kernels. A popular approach to get around the curse of dimensionality is to consider simplified models such as additive models [Stone 85]:
$$f(x) \approx m(x) = \sum_{i=1}^d m_i(x_i).$$
Examples of such models: regression without interaction, generalized additive models [Hastie 90].
15 Additive kernels. In order to obtain additive kriging models, we consider kernels of the form
$$K(x, y) = \sum_{i=1}^d K_i(x_i, y_i).$$
As said before, such kernels are symmetric and of positive type. We will call them additive kernels.
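A minimal sketch of sampling paths of a GP with an additive kernel on $[0,1]^2$, assuming squared exponential components (grid size and jitter are illustrative choices):

```python
import numpy as np

def k1d(s, t, theta=0.2):
    """Univariate squared exponential kernel, vectorized over grids."""
    return np.exp(-np.subtract.outer(s, t) ** 2 / (2 * theta ** 2))

def k_add(X, Y):
    """Additive kernel: sum of univariate kernels over the d dimensions."""
    return sum(k1d(X[:, i], Y[:, i]) for i in range(X.shape[1]))

g = np.linspace(0, 1, 20)
XX = np.array([[a, b] for a in g for b in g])        # grid on [0,1]^2
C = k_add(XX, XX) + 1e-8 * np.eye(len(XX))           # jitter for stability
path = np.random.default_rng(1).multivariate_normal(np.zeros(len(XX)), C)

# Additivity check: Z(a,b) + Z(a',b') - Z(a,b') - Z(a',b) vanishes for
# additive paths (up to the jitter):
idx = lambda i, j: i * 20 + j
print(path[idx(3, 5)] + path[idx(10, 15)] - path[idx(3, 15)] - path[idx(10, 5)])
```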
16 Additive kernels. Examples of sample paths of a GP $Y$ with an additive kernel. [Figure: three sample paths $Y(x_1, x_2)$.] The paths of $Y$ are additive (up to a modification).
17 Additive kernels. Now, let us consider a GP model with an additive kernel. As an additive kernel is still a kernel, the kriging equations do not change:
$$m(x) = k(x)^T \mathbf{K}^{-1} F$$
$$v(x) = K(x, x) - k(x)^T \mathbf{K}^{-1} k(x)$$
18 Interpretability of additive kriging models. When the input space is high-dimensional, usual kriging models cannot easily be interpreted: they can be seen as black boxes. On the other hand, additive kriging models are easily interpretable:
$$m(x) = (k_1(x_1) + k_2(x_2))^T (\mathbf{K}_1 + \mathbf{K}_2)^{-1} F = \underbrace{k_1(x_1)^T (\mathbf{K}_1 + \mathbf{K}_2)^{-1} F}_{m_1(x_1)} + \underbrace{k_2(x_2)^T (\mathbf{K}_1 + \mathbf{K}_2)^{-1} F}_{m_2(x_2)}$$
For the sub-model $m_1$, the covariance matrix $\mathbf{K}_2$ plays the role of an observation noise.
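A minimal sketch of this decomposition for $d = 2$, with illustrative univariate squared exponential kernels and a toy additive test function:

```python
import numpy as np

def k1d(s, t, theta=0.3):
    return np.exp(-np.subtract.outer(s, t) ** 2 / (2 * theta ** 2))

rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=(15, 2))                  # design points in [0,1]^2
F = np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2       # an additive test function

K1 = k1d(X[:, 0], X[:, 0])                           # (K1)_ij = K_1(X_i1, X_j1)
K2 = k1d(X[:, 1], X[:, 1])
alpha = np.linalg.solve(K1 + K2 + 1e-8 * np.eye(15), F)   # (K1 + K2)^{-1} F

m1 = lambda x1: k1d(np.atleast_1d(x1), X[:, 0]) @ alpha   # sub-model m_1(x_1)
m2 = lambda x2: k1d(np.atleast_1d(x2), X[:, 1]) @ alpha   # sub-model m_2(x_2)

print(m1(0.4) + m2(0.6))   # the full predictor m(x) is the sum of the sub-models
```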
19 Interpretability of additive kriging models. We obtain for the sub-models: [Figure: the univariate sub-models $m_1(x_1)$ and $m_2(x_2)$.]
20 Additive models and linear budget. As previously, if we look at the neighborhood of a point $X_i$, we see that the observations do not only have a local influence, so the number of observations can increase linearly with the dimension. [Figure: influence region of an observation for an additive kernel.]
21 Additive models and linear budget. Let $Z_p$ be a centered GP over $[0, 1]^d$ with a tensor product kernel, $Z_a$ be a centered GP over $[0, 1]^d$ with an additive kernel, and $Z = Z_a + Z_p$ (the two are assumed independent). We compare the predictivity of approximating a path $Z_\omega$ by an additive kriging model and by a kriging model based on a tensor product kernel, as $d$ increases.
22 Additive models and linear budget. For $d = 1, \dots, 30$, we choose $n = 10d$ observations and compare the percentage of variance explained by the two models. [Figure: percentage of variance explained versus dimension for the additive kernel and the tensor product kernel.]
23 Additive models and linear budget. Those results depend on the value of $\theta$. [Figure: percentage of variance explained versus dimension for the additive and tensor product kernels, for $\theta = 0.25$, $\theta = 0.5$ and $\theta = 1$.] With a limited, linear budget, simple models can outperform more complex models.
24 Additive models and linear budget. We have seen two advantages of additive models: they are easily interpretable, and they require a reasonable number of observations. However, those models are usually too simple for modeling real-life phenomena. How can we increase the complexity of the models?
25 ANOVA kernels. In order to take the interactions between variables into account, one can consider ANOVA kernels [Stitson 97]:
$$K(x, y) = \prod_{i=1}^d (1 + K_i(x_i, y_i)) = 1 + \underbrace{\sum_{i=1}^d K_i(x_i, y_i)}_{\text{additive part}} + \underbrace{\sum_{i<j} K_i(x_i, y_i) K_j(x_j, y_j)}_{\text{2nd order interactions}} + \dots + \underbrace{\prod_{i=1}^d K_i(x_i, y_i)}_{\text{full interaction}}$$
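A small sketch of an ANOVA kernel built as this product, assuming squared exponential components (names illustrative):

```python
import numpy as np

def k_i(s, t, theta=0.2):
    """Univariate squared exponential component."""
    return np.exp(-(s - t) ** 2 / (2 * theta ** 2))

def k_anova(x, y):
    """ANOVA kernel: prod_i (1 + K_i(x_i, y_i))."""
    return np.prod([1.0 + k_i(xi, yi) for xi, yi in zip(x, y)])

x, y = np.array([0.1, 0.5]), np.array([0.2, 0.4])
k1, k2 = k_i(x[0], y[0]), k_i(x[1], y[1])
# For d = 2 the product expands into the constant, additive and interaction parts:
assert np.isclose(k_anova(x, y), 1 + k1 + k2 + k1 * k2)
print(k_anova(x, y))
```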
26 ANOVA kernels. A decomposition of the best predictor is naturally associated with those kernels. Example: in 2D we have $K = 1 + K_1 + K_2 + K_1 K_2$, so the best predictor can be written as
$$m(x) = (1 + k_1(x_1) + k_2(x_2) + k_1(x_1) k_2(x_2))^T \mathbf{K}^{-1} F = m_0 + m_1(x_1) + m_2(x_2) + m_{12}(x).$$
This decomposition looks like the ANOVA representation of $m$, but the $m_I$ do not satisfy $\int_{D_i} m_I(x_I)\, dx_i = 0$.
27 ANOVA kernels. The ANOVA representation is based on a functional decomposition of $L^2$: if $D = D_1 \times \dots \times D_d$ and $\mu = \mu_1 \otimes \dots \otimes \mu_d$, we have
$$L^2(D, \mu) = \bigotimes_{i=1}^d \left( 1_{D_i} + L^2_0(D_i, \mu_i) \right).$$
If we can build an RKHS with the same structure, the ANOVA representation of $m$ can be obtained naturally. How can we build an RKHS of zero-mean functions? One example is given in [Wahba 97].
28 Kernel ANOVA decomposition. Using the RKHS framework, we showed that from any usual 1-dimensional kernel $K$ we can extract a kernel $K_0$ associated with an RKHS of zero-mean functions. Let $R$ be the Riesz representer of the linear functional $f \mapsto \int f(s)\, ds$ for the inner product $\langle \cdot, \cdot \rangle_\mathcal{H}$. We define $\mathcal{H}_1 = \mathrm{span}(R)$ and $\mathcal{H}_0$ as its orthogonal complement in $\mathcal{H}$. [Figure: decomposition of $\mathcal{H}$ into $\mathrm{span}(R)$ and $\mathcal{H}_0$.]
29 Kernel ANOVA decomposition. The expression of $R(x)$ can be obtained easily:
$$R(x) = \langle R, K(x, \cdot) \rangle_\mathcal{H} = \int_D K(x, s)\, ds.$$
[Figure: $R(x)$ for the Brownian kernel, the Gaussian kernel with $\theta = 1$, and the Gaussian kernel with $\theta = 0.25$.]
30 Kernel ANOVA decomposition. Finally, we have $\mathcal{H} = \mathcal{H}_1 \oplus \mathcal{H}_0$, with $\mathcal{H}_1 = \mathrm{span}(R)$ a one-dimensional RKHS and $\mathcal{H}_0$ an RKHS of zero-mean functions. Those spaces have kernels:
$$K_1(x, y) = \frac{\int K(x, s)\, ds \int K(y, s)\, ds}{\iint K(s, t)\, ds\, dt}$$
$$K_0(x, y) = K(x, y) - \frac{\int K(x, s)\, ds \int K(y, s)\, ds}{\iint K(s, t)\, ds\, dt}$$
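A sketch of $K_0$ with the integrals approximated by a uniform quadrature on $D = [0, 1]$ (the base kernel and node count are illustrative assumptions):

```python
import numpy as np

s = np.linspace(0, 1, 200)           # quadrature nodes on D = [0, 1]
w = np.full(200, 1 / 200)            # uniform quadrature weights

def K(x, y, theta=0.2):
    return np.exp(-np.subtract.outer(x, y) ** 2 / (2 * theta ** 2))

c = w @ K(s, s) @ w                  # double integral of K over D x D

def K0(x, y):
    """K_0(x,y) = K(x,y) - (int K(x,s)ds)(int K(y,s)ds) / (int int K(s,t)dsdt)."""
    return K(x, y) - np.outer(K(x, s) @ w, K(y, s) @ w) / c

# Functions of H_0 have zero mean: int K_0(x, t) dt should vanish for any x
x = np.atleast_1d(0.3)
print((K0(x, s) @ w)[0])             # ~ 0 up to quadrature error
```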
31 Kernel ANOVA decomposition. As for the ANOVA representation in $L^2$, we can build an RKHS $\mathcal{H}$:
$$\mathcal{H} = \bigotimes_{i=1}^d \left( 1_{D_i} + \mathcal{H}^0_i \right), \qquad K(x, y) = \prod_{i=1}^d \left( 1 + K^0_i(x_i, y_i) \right).$$
With this space, the ANOVA representation is obtained naturally:
$$m(x) = (1 + k^0_1(x_1) + k^0_2(x_2) + k^0_1(x_1) k^0_2(x_2))^T \mathbf{K}^{-1} F = m_0 + m_1(x_1) + m_2(x_2) + m_{12}(x).$$
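A sketch of this ANOVA decomposition of the predictor for $d = 2$, reusing the quadrature-based $K_0$ construction above (design, test function and jitter are illustrative):

```python
import numpy as np

s = np.linspace(0, 1, 200); w = np.full(200, 1 / 200)   # quadrature on [0, 1]

def K(x, y, theta=0.3):
    return np.exp(-np.subtract.outer(x, y) ** 2 / (2 * theta ** 2))

c = w @ K(s, s) @ w
def K0(x, y):                                            # zero-mean kernel of slide 30
    return K(x, y) - np.outer(K(x, s) @ w, K(y, s) @ w) / c

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(30, 2))
F = np.sin(2 * np.pi * X[:, 0]) + X[:, 1] + X[:, 0] * X[:, 1]

G = (1 + K0(X[:, 0], X[:, 0])) * (1 + K0(X[:, 1], X[:, 1])) + 1e-8 * np.eye(30)
alpha = np.linalg.solve(G, F)                            # K^{-1} F

x = np.array([0.4, 0.6])
k01 = K0(np.atleast_1d(x[0]), X[:, 0])[0]                # k_1^0(x_1)
k02 = K0(np.atleast_1d(x[1]), X[:, 1])[0]                # k_2^0(x_2)

m0  = np.ones(30) @ alpha      # constant term m_0
m1  = k01 @ alpha              # main effect m_1(x_1)
m2  = k02 @ alpha              # main effect m_2(x_2)
m12 = (k01 * k02) @ alpha      # interaction m_12(x)
print(m0, m1, m2, m12, m0 + m1 + m2 + m12)   # the last value is m(x)
```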
32 Application 1: interpretation. Let us consider the random test function $f : [0, 1]^{10} \rightarrow \mathbb{R}$:
$$x \mapsto 10 \sin(\pi x_1 x_2) + 20 (x_3 - 0.5)^2 + 10 x_4 + 5 x_5 + \mathcal{N}(0, 1).$$
The steps for approximating $f$ with a GP model are:
1 learn $f$ on a DoE (here a maximin LHS with 180 points),
2 get the optimal values $\psi$ of the kernel parameters using MLE,
3 build the kriging predictor $\hat{f}$ based on $K_0$.
As $\hat{f}$ is a function of 10 variables, the model cannot easily be represented: it is usually considered as a black box. However, the structure of $K$ allows $m$ to be split into sub-models.
33 Application 1: interpretation. The univariate sub-models are: [Figure: the univariate sub-models.] Recall that $f(x) = 10 \sin(\pi x_1 x_2) + 20 (x_3 - 0.5)^2 + 10 x_4 + 5 x_5 + \mathcal{N}(0, 1)$.
34 Application 2: computation of Sobol indices. Using $K$, the sensitivity indices $S_I$ can be computed analytically:
$$S_I = \frac{\mathrm{var}(m_I(X_I))}{\mathrm{var}(m(X))} = \frac{F^T \mathbf{K}^{-1} \left( \bigodot_{i \in I} \Gamma_i \right) \mathbf{K}^{-1} F}{F^T \mathbf{K}^{-1} \left( \bigodot_{i=1}^d (1_{n \times n} + \Gamma_i) - 1_{n \times n} \right) \mathbf{K}^{-1} F}$$
where $\Gamma_i$ is the matrix $\Gamma_i = \int_{D_i} k^0_i(s_i)\, k^0_i(s_i)^T\, ds_i$, $1_{n \times n}$ is the matrix of ones, and $\odot$ is the term-wise (Hadamard) product. Conversely to other methods, the computation of $S_I$ does not require computing all the $S_J$ for $J \subset I$.
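A sketch of this formula for $d = 2$, with the $\Gamma_i$ matrices approximated by the same uniform quadrature as before (all names illustrative; the toy function makes $x_1$ dominant, so $S_1$ should be close to 1):

```python
import numpy as np

s = np.linspace(0, 1, 200); w = np.full(200, 1 / 200)   # quadrature on [0, 1]

def K(x, y, theta=0.3):
    return np.exp(-np.subtract.outer(x, y) ** 2 / (2 * theta ** 2))

c = w @ K(s, s) @ w
def K0(x, y):
    return K(x, y) - np.outer(K(x, s) @ w, K(y, s) @ w) / c

rng = np.random.default_rng(3)
X = rng.uniform(0, 1, size=(30, 2))
F = np.sin(2 * np.pi * X[:, 0]) + 0.3 * X[:, 1]          # x_1 should dominate

G = (1 + K0(X[:, 0], X[:, 0])) * (1 + K0(X[:, 1], X[:, 1])) + 1e-8 * np.eye(30)
alpha = np.linalg.solve(G, F)                            # K^{-1} F

def gamma(i):
    A = K0(X[:, i], s)                   # columns are k_i^0(t) at the quadrature nodes
    return (A * w) @ A.T                 # Gamma_i = int k_i^0(t) k_i^0(t)^T dt

G1, G2 = gamma(0), gamma(1)
ones = np.ones((30, 30))
denom = alpha @ ((ones + G1) * (ones + G2) - ones) @ alpha   # var(m(X))
S1  = alpha @ G1 @ alpha / denom          # first-order index of x_1
S2  = alpha @ G2 @ alpha / denom          # first-order index of x_2
S12 = alpha @ (G1 * G2) @ alpha / denom   # interaction index
print(S1, S2, S12)
```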
35 Conclusion. Additive models are useful for modeling high-dimensional phenomena and can be used for extracting an additive trend. Kernels for sensitivity analysis: the kernels $K$ built from the $K^0_i$ correspond to a particular class of ANOVA kernels; they allow the terms of the ANOVA representation to be obtained efficiently. This is useful for showing the first terms for model interpretation and for computing the sensitivity indices.
36 Conclusion. Perspectives: confidence intervals for the sub-models; RKHS orthogonal to operators other than the integral $\int \cdot\, dx$. Future work: models with very high-dimensional inputs; kernels taking specific features into account.
37 Conclusion. Thank you for your attention.