Introduction to Gaussian-process based Kriging models for metamodeling and validation of computer codes


1 Introduction to Gaussian-process based Kriging models for metamodeling and validation of computer codes. François Bachoc, Department of Statistics and Operations Research, University of Vienna (former PhD student at CEA-Saclay, DEN, DM2S, STMF, LGLS, F-91191 Gif-Sur-Yvette, France and LPMA, Université Paris 7). LRC MANON - INSTN Saclay - March 2014.

2 Outline: 1. Introduction to Gaussian processes; 2. Kriging prediction (conditional distribution and Gaussian conditioning theorem; Kriging prediction); 3. Application to metamodeling of the GERMINAL code; 4. Application to validation of the FLICA 4 thermal-hydraulic code.

3 Random variables and vectors. Remark: a random variable $X$ is a random number, defined by a probability density function $f_X : \mathbb{R} \to \mathbb{R}_+$ for which, for $a \leq b \in \mathbb{R}$: "probability of $a \leq X \leq b$" $= \int_a^b f_X(x)\,dx$. Similarly, a random vector $V = (V_1, \ldots, V_n)^t$ is a vector of random variables. It is also defined by a probability density function $f_V : \mathbb{R}^n \to \mathbb{R}_+$ for which, for $E \subset \mathbb{R}^n$: "probability of $V \in E$" $= \int_E f_V(v)\,dv$. Naturally we have $\int_{-\infty}^{+\infty} f_X(x)\,dx = \int_{\mathbb{R}^n} f_V(v)\,dv = 1$.

4 Mean, variance and covariance. The mean of a random variable $X$ with density $f_X$ is denoted $E(X)$ and is $E(X) = \int_{-\infty}^{+\infty} x f_X(x)\,dx$. The variance of $X$ is denoted $\mathrm{var}(X)$ and is $\mathrm{var}(X) = E\{(X - E(X))^2\}$. If $\mathrm{var}(X)$ is large, $X$ can be far from its mean (more uncertainty); if $\mathrm{var}(X)$ is small, $X$ is close to its mean (less uncertainty). Let $X, Y$ be two random variables. The covariance between $X$ and $Y$ is denoted $\mathrm{cov}(X, Y)$ and is $\mathrm{cov}(X, Y) = E\{(X - E(X))(Y - E(Y))\}$. If $|\mathrm{cov}(X, Y)| \approx \sqrt{\mathrm{var}(X)\mathrm{var}(Y)}$, $X$ and $Y$ are almost proportional to one another. If $|\mathrm{cov}(X, Y)| \ll \sqrt{\mathrm{var}(X)\mathrm{var}(Y)}$, $X$ and $Y$ are almost independent (when they are Gaussian).

5 Mean vector and covariance matrix. Let $V = (V_1, \ldots, V_n)^t$ be a random vector. The mean vector of $V$ is denoted $E(V)$ and is the $n \times 1$ vector defined by $(E(V))_i = E(V_i)$. The covariance matrix of $V$ is denoted $\mathrm{cov}(V)$ and is the $n \times n$ matrix defined by $(\mathrm{cov}(V))_{i,j} = \mathrm{cov}(V_i, V_j)$. The diagonal terms show which components are the most uncertain. The off-diagonal terms show the dependence between the components.

6 Stochastic processes. A stochastic process is a function $Z : \mathbb{R}^n \to \mathbb{R}$ such that each value $Z(x)$ is a random variable. Alternatively, a stochastic process is a function on $\mathbb{R}^n$ that is unknown, or that depends on underlying random phenomena. We make the randomness of $Z(x)$ explicit by writing it $Z(\omega, x)$ with $\omega$ in a probability space $\Omega$. For a given $\omega_0$, we call the function $x \mapsto Z(\omega_0, x)$ a realization of the stochastic process $Z$. Mean function: $M : x \mapsto M(x) = E(Z(x))$. Covariance function: $C : (x_1, x_2) \mapsto C(x_1, x_2) = \mathrm{cov}(Z(x_1), Z(x_2))$.

7 Gaussian variables and vectors. A random variable $X$ is a Gaussian variable with mean $\mu$ and variance $\sigma^2 > 0$ when its probability density function is $f_{\mu,\sigma^2}(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$. An $n$-dimensional random vector $V$ is a Gaussian vector with mean vector $m$ and invertible covariance matrix $R$ when its multidimensional probability density function is $f_{m,R}(v) = \frac{1}{(2\pi)^{n/2}\sqrt{\det(R)}} \exp\left(-\frac{1}{2}(v-m)^t R^{-1}(v-m)\right)$. E.g. for Gaussian variables: $\mu$ and $\sigma^2$ are both the parameters of the probability density function and its mean and variance. That is, $\int_{-\infty}^{+\infty} x f_{\mu,\sigma^2}(x)\,dx = \mu$ and $\int_{-\infty}^{+\infty} (x-\mu)^2 f_{\mu,\sigma^2}(x)\,dx = \sigma^2$.

8 Gaussian processes. A stochastic process $Z$ on $\mathbb{R}^d$ is a Gaussian process when, for all $x_1, \ldots, x_n$, the random vector $(Z(x_1), \ldots, Z(x_n))$ is Gaussian. A Gaussian process is characterized by its mean and covariance functions. Why are Gaussian processes convenient? The Gaussian distribution is reasonable for modeling a large variety of random variables; Gaussian processes are simple to define and simulate; they are characterized by their mean and covariance functions; as we will see, Gaussian properties simplify the resolution of problems; and Gaussian processes have been the most studied theoretically.

9 Example of the Matérn 3/2 covariance function on $\mathbb{R}$. The Matérn 3/2 covariance function, for a Gaussian process on $\mathbb{R}$, is parameterized by a variance parameter $\sigma^2 > 0$ and a correlation length parameter $\ell > 0$. It is defined as $C(x_1, x_2) = \sigma^2 \left(1 + \frac{\sqrt{6}\,|x_1 - x_2|}{\ell}\right) e^{-\frac{\sqrt{6}\,|x_1 - x_2|}{\ell}}$. [Figure: covariance against $|x_1 - x_2|$ for $\ell = 0.5, 1, 2$.] Interpretation: the Matérn 3/2 function is stationary, $C(x_1 + h, x_2 + h) = C(x_1, x_2)$, so the behavior of the corresponding Gaussian process is invariant by translation. $\sigma^2$ corresponds to the order of magnitude of the functions that are realizations of the Gaussian process. $\ell$ corresponds to the speed of variation of the functions that are realizations of the Gaussian process.
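A minimal numerical sketch of this covariance function and of sampling realizations of the corresponding zero-mean Gaussian process (NumPy; the $\sqrt{6}$ parameterization follows the formula above, and the grid and seed are arbitrary):

```python
import numpy as np

def matern_32(x1, x2, sigma2=1.0, ell=1.0):
    """Matern 3/2 covariance between scalar inputs (sqrt(6) convention)."""
    h = np.sqrt(6.0) * np.abs(x1 - x2) / ell
    return sigma2 * (1.0 + h) * np.exp(-h)

# Covariance matrix on a grid, then one realization per correlation length,
# as illustrated on the next slide.
x = np.linspace(0.0, 1.0, 200)
rng = np.random.default_rng(0)
for ell in (0.5, 1.0, 2.0):
    K = matern_32(x[:, None], x[None, :], sigma2=1.0, ell=ell)
    # small jitter on the diagonal for numerical stability of the Cholesky factor
    L = np.linalg.cholesky(K + 1e-10 * np.eye(len(x)))
    z = L @ rng.standard_normal(len(x))   # one realization of the process
    print(ell, z[:3])
```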

10 The Matérn 3/2 covariance function on $\mathbb{R}$: illustration of $\ell$. [Figure: realizations of a Gaussian process having the Matérn 3/2 covariance function, for $\sigma^2 = 1$ and $\ell = 0.5, 1, 2$ from left to right.]

11 The Matérn 3/2 covariance function: generalization to $\mathbb{R}^d$. We now consider a Gaussian process on $\mathbb{R}^d$. The corresponding multidimensional Matérn 3/2 covariance function is parameterized by a variance parameter $\sigma^2 > 0$ and $d$ correlation length parameters $\ell_1 > 0, \ldots, \ell_d > 0$. It is defined as $C(x, y) = \sigma^2 \left(1 + \sqrt{6}\,\|x - y\|_{\ell_1,\ldots,\ell_d}\right) e^{-\sqrt{6}\,\|x - y\|_{\ell_1,\ldots,\ell_d}}$ with $\|x - y\|_{\ell_1,\ldots,\ell_d} = \sqrt{\sum_{i=1}^d \frac{(x_i - y_i)^2}{\ell_i^2}}$. Interpretation: still stationary; $\sigma^2$ still drives the order of magnitude of the realizations; $\ell_1, \ldots, \ell_d$ correspond to the speed of variation of the realizations $x \mapsto Z(\omega, x)$ when only the corresponding variable $x_1, \ldots, x_d$ varies. When $\ell_i$ is particularly small, the variable $x_i$ is particularly important: this gives a hierarchy of the input variables $x_1, \ldots, x_d$ according to their correlation lengths $\ell_1, \ldots, \ell_d$.
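The anisotropic multidimensional form can be sketched as follows (again a non-authoritative illustration; the helper name is invented and the $\sqrt{6}$ convention is taken from the formula above):

```python
import numpy as np

def matern_32_aniso(X, Y, sigma2=1.0, ells=None):
    """Anisotropic Matern 3/2 covariance matrix between two sets of points.

    X: (n, d) array, Y: (m, d) array, ells: length-d vector of correlation lengths.
    """
    X, Y = np.asarray(X, dtype=float), np.asarray(Y, dtype=float)
    ells = np.ones(X.shape[1]) if ells is None else np.asarray(ells, dtype=float)
    diff = (X[:, None, :] - Y[None, :, :]) / ells        # scaled differences
    h = np.sqrt(6.0) * np.sqrt(np.sum(diff ** 2, axis=-1))
    return sigma2 * (1.0 + h) * np.exp(-h)

# A small correlation length in one direction makes that input more influential:
X = np.random.default_rng(1).uniform(size=(5, 2))
print(matern_32_aniso(X, X, sigma2=1.0, ells=[0.2, 2.0]).round(3))
```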

12 Summary. A Gaussian process can be seen as a random phenomenon yielding realizations, i.e. specific functions $\mathbb{R}^d \to \mathbb{R}$. The standard probability tools enable us to model and quantify the uncertainty we have on these realizations. The choice of the covariance function (e.g. Matérn 3/2) makes it possible to synthesize the information we have (or get) on the nature of the realizations with a small number of parameters.

13 Outline: 1. Introduction to Gaussian processes; 2. Kriging prediction (conditional distribution and Gaussian conditioning theorem; Kriging prediction); 3. Application to metamodeling of the GERMINAL code; 4. Application to validation of the FLICA 4 thermal-hydraulic code.

14 Conditional probability density function. Consider a partitioned random vector $(Y_1, Y_2)^t$ of size $(n_1 + 1) \times 1$, with probability density function $f_{Y_1,Y_2} : \mathbb{R}^{n_1+1} \to \mathbb{R}_+$. Then $Y_1$ has the probability density function $f_{Y_1}(y_1) = \int_{\mathbb{R}} f_{Y_1,Y_2}(y_1, y_2)\,dy_2$. The conditional probability density function of $Y_2$ given $Y_1 = y_1$ is then $f_{Y_2 | Y_1 = y_1}(y_2) = \frac{f_{Y_1,Y_2}(y_1, y_2)}{f_{Y_1}(y_1)}$. Interpretation: it is the continuous generalization of the Bayes formula $P(A|B) = \frac{P(A, B)}{P(B)}$.

15 Conditional mean. Consider a partitioned random vector $(Y_1, Y_2)^t$ of size $(n_1 + 1) \times 1$, with conditional probability density function of $Y_2$ given $Y_1 = y_1$ given by $f_{Y_2 | Y_1 = y_1}(y_2)$. Then the conditional mean of $Y_2$ given $Y_1 = y_1$ is $E(Y_2 | Y_1 = y_1) = \int_{\mathbb{R}} y_2 f_{Y_2 | Y_1 = y_1}(y_2)\,dy_2$. $E(Y_2 | Y_1 = y_1)$ is in fact a function of $y_1$; applied to the random $Y_1$, it is thus itself a random variable, which we emphasize by writing $E(Y_2 | Y_1)$. Hence $E(Y_2 | Y_1 = y_1)$ is a realization of $E(Y_2 | Y_1)$. Optimality: the function $y_1 \mapsto E(Y_2 | Y_1 = y_1)$ is the best prediction of $Y_2$ we can make when observing only $Y_1$. That is, for any function $f : \mathbb{R}^{n_1} \to \mathbb{R}$: $E\{(Y_2 - f(Y_1))^2\} \geq E\{(Y_2 - E(Y_2 | Y_1))^2\}$.

16 Conditional variance. Consider a partitioned random vector $(Y_1, Y_2)^t$ of size $(n_1 + 1) \times 1$, with conditional probability density function of $Y_2$ given $Y_1 = y_1$ given by $f_{Y_2 | Y_1 = y_1}(y_2)$. Then the conditional variance of $Y_2$ given $Y_1 = y_1$ is $\mathrm{var}(Y_2 | Y_1 = y_1) = \int_{\mathbb{R}} (y_2 - E(Y_2 | Y_1 = y_1))^2 f_{Y_2 | Y_1 = y_1}(y_2)\,dy_2$. Summary: the conditional mean $E(Y_2 | Y_1)$ is the best possible prediction of $Y_2$ given $Y_1$; the conditional probability density function $y_2 \mapsto f_{Y_2 | Y_1 = y_1}(y_2)$ gives the probability density function of the corresponding error (most probable value, probability of threshold exceedance, ...); the conditional variance $\mathrm{var}(Y_2 | Y_1 = y_1)$ summarizes the order of magnitude of the prediction error.

17 Gaussian conditioning theorem. Theorem: let $(Y_1, Y_2)^t$ be a $(n_1 + 1) \times 1$ Gaussian vector with mean vector $(m_1^t, \mu_2)^t$ and covariance matrix $\begin{pmatrix} R_1 & r_{1,2} \\ r_{1,2}^t & \sigma_2^2 \end{pmatrix}$. Then, conditionally on $Y_1 = y_1$, $Y_2$ is a Gaussian variable with mean $E(Y_2 | Y_1 = y_1) = \mu_2 + r_{1,2}^t R_1^{-1}(y_1 - m_1)$ and variance $\mathrm{var}(Y_2 | Y_1 = y_1) = \sigma_2^2 - r_{1,2}^t R_1^{-1} r_{1,2}$. Illustration: when $(Y_1, Y_2)^t$ is a $2 \times 1$ Gaussian vector with mean vector $(\mu_1, \mu_2)^t$ and covariance matrix $\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$, then $E(Y_2 | Y_1 = y_1) = \mu_2 + \rho(y_1 - \mu_1)$ and $\mathrm{var}(Y_2 | Y_1 = y_1) = 1 - \rho^2$.
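A quick numerical sanity check of these conditioning formulas (a sketch with NumPy; the covariance matrix, mean vector, and observed value are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n1 = 3
# an arbitrary (n1+1) x (n1+1) positive-definite covariance matrix and mean vector
A = rng.standard_normal((n1 + 1, n1 + 1))
cov = A @ A.T + np.eye(n1 + 1)
mean = rng.standard_normal(n1 + 1)

R1, r12, s22 = cov[:n1, :n1], cov[:n1, n1], cov[n1, n1]
m1, mu2 = mean[:n1], mean[n1]

y1 = rng.standard_normal(n1)                    # an observed value of Y_1
cond_mean = mu2 + r12 @ np.linalg.solve(R1, y1 - m1)
cond_var = s22 - r12 @ np.linalg.solve(R1, r12)
print(cond_mean, cond_var)                      # conditional law of Y_2 given Y_1 = y1
```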

18 Outline: 1. Introduction to Gaussian processes; 2. Kriging prediction (conditional distribution and Gaussian conditioning theorem; Kriging prediction); 3. Application to metamodeling of the GERMINAL code; 4. Application to validation of the FLICA 4 thermal-hydraulic code.

19 A problem of function approximation. We want to approximate a deterministic function from a finite number of observed values of it. [Figure: observed points of an unknown function of $x$.] A possibility: deterministic approximation (polynomial regression, neural networks, splines, RKHS, ...), for which we can have a deterministic error bound. With a Kriging model: a stochastic method, which gives a stochastic error bound.

20 Kriging model with Gaussian process realizations. Kriging model: representing the deterministic and unknown function by a realization of a Gaussian process. [Figure: $y$ versus $x$.] Bayesian interpretation: in statistics, a Bayesian model generally consists in representing a deterministic and unknown number by the realization of a random variable (this enables us to incorporate expert knowledge, gives access to the Bayes formula, ...). Here, we do the same with functions.

21 Kriging prediction. We let $Y$ be the Gaussian process, on $\mathbb{R}^d$. $Y$ is observed at $x_1, \ldots, x_n \in \mathbb{R}^d$. We consider here that we know the covariance function $C$ of $Y$, and that the mean function of $Y$ is zero. Notations: let $Y_n = (Y(x_1), \ldots, Y(x_n))^t$ be the observation vector; it is a Gaussian vector. Let $R$ be the $n \times n$ covariance matrix of $Y_n$: $(R)_{i,j} = C(x_i, x_j)$. Let $x_{new} \in \mathbb{R}^d$ be a new input point for the Gaussian process $Y$; we want to predict $Y(x_{new})$. Let $r$ be the $n \times 1$ covariance vector between $Y_n$ and $Y(x_{new})$: $r_i = C(x_i, x_{new})$. Then the Gaussian conditioning theorem gives the conditional mean of $Y(x_{new})$ given the observed values in $Y_n$: $\hat{y}(x_{new}) := E(Y(x_{new}) | Y_n) = r^t R^{-1} Y_n$. We also have the conditional variance: $\hat{\sigma}^2(x_{new}) := \mathrm{var}(Y(x_{new}) | Y_n) = C(x_{new}, x_{new}) - r^t R^{-1} r$.
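These two formulas translate directly into code. A minimal sketch (NumPy, reusing a Matérn 3/2 covariance as above; the observed function is an arbitrary stand-in):

```python
import numpy as np

def matern_32(x1, x2, sigma2=1.0, ell=0.3):
    h = np.sqrt(6.0) * np.abs(x1 - x2) / ell
    return sigma2 * (1.0 + h) * np.exp(-h)

def krige(x_obs, y_obs, x_new, cov=matern_32):
    """Simple Kriging (known covariance, zero mean): conditional mean and variance."""
    R = cov(x_obs[:, None], x_obs[None, :])
    r = cov(x_obs[:, None], x_new[None, :])            # shape (n, n_new)
    alpha = np.linalg.solve(R, y_obs)                  # R^{-1} Y_n
    mean = r.T @ alpha
    var = cov(x_new, x_new) - np.sum(r * np.linalg.solve(R, r), axis=0)
    return mean, var

x_obs = np.linspace(0.0, 1.0, 8)
y_obs = np.sin(4.0 * x_obs)                            # stand-in for the unknown function
x_new = np.linspace(0.0, 1.0, 101)
mean, var = krige(x_obs, y_obs, x_new)
print(mean[:3], var[:3])
```

Evaluating this at the observation points returns the observed values with zero variance, and far from them the prediction reverts to the prior mean 0 with variance $\sigma^2$, matching the interpretation on the next slide.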

22 Kriging prediction: interpretation. Exact reproduction of known values: assume $x_{new} = x_1$. Then $R_{i,1} = C(x_i, x_1) = C(x_i, x_{new}) = r_i$, i.e. $r$ is the first column of $R$. Thus $r^t R^{-1} = (1, 0, \ldots, 0)$ and $r^t R^{-1} Y_n = (1, 0, \ldots, 0)(Y(x_1), \ldots, Y(x_n))^t = Y(x_1)$. Conservative extrapolation: let $x_{new}$ be far from $x_1, \ldots, x_n$. Then we generally have $r_i = C(x_i, x_{new}) \approx 0$. Thus $\hat{y}(x_{new}) = r^t R^{-1} Y_n \approx 0$ and, conservatively, $\hat{\sigma}^2(x_{new}) = C(x_{new}, x_{new}) - r^t R^{-1} r \approx C(x_{new}, x_{new})$.

23 Illustration of Kriging prediction. [Figure: observations.]

24 Illustration of Kriging prediction. [Figure: observations; realizations from the conditional distribution given $Y_n$.]

25 Illustration of Kriging prediction. [Figure: observations; realizations from the conditional distribution given $Y_n$; conditional mean $x_{new} \mapsto E(Y(x_{new}) | Y_n)$.]

26 Illustration of Kriging prediction. [Figure: observations; realizations from the conditional distribution given $Y_n$; conditional mean $x_{new} \mapsto E(Y(x_{new}) | Y_n)$; 95% confidence intervals.]

27 Kriging prediction with measurement error. It can be desirable not to reproduce the observed values exactly: when the observations come from experiments (variability of the response for a fixed input point), or, even when the response is fixed for a given input point, when it can vary strongly between very close input points. Observations with measurement error: we consider that at $x_1, \ldots, x_n$ we observe $Y(x_1) + \epsilon_1, \ldots, Y(x_n) + \epsilon_n$, where $\epsilon_1, \ldots, \epsilon_n$ are independent Gaussian variables with mean 0 and known variance $\sigma^2_{mes}$. We let $Y_n = (Y(x_1) + \epsilon_1, \ldots, Y(x_n) + \epsilon_n)^t$. Then the Gaussian conditioning theorem still gives the conditional mean of $Y(x_{new})$ given the observed values in $Y_n$: $\hat{y}(x_{new}) := E(Y(x_{new}) | Y_n) = r^t (R + \sigma^2_{mes} I_n)^{-1} Y_n$. We also have the conditional variance: $\hat{\sigma}^2(x_{new}) := \mathrm{var}(Y(x_{new}) | Y_n) = C(x_{new}, x_{new}) - r^t (R + \sigma^2_{mes} I_n)^{-1} r$.
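In code this amounts to adding the nugget term $\sigma^2_{mes} I_n$ to the covariance matrix of the observations. A sketch modifying the kriging function above (function and variable names are illustrative, not from the original):

```python
import numpy as np

def krige_with_noise(x_obs, y_obs, x_new, cov, sigma2_mes):
    """Kriging with a known Gaussian measurement-error variance (nugget term)."""
    R = cov(x_obs[:, None], x_obs[None, :]) + sigma2_mes * np.eye(len(x_obs))
    r = cov(x_obs[:, None], x_new[None, :])
    mean = r.T @ np.linalg.solve(R, y_obs)
    var = cov(x_new, x_new) - np.sum(r * np.linalg.solve(R, r), axis=0)
    return mean, var

# usage with a Matern 3/2 covariance and noisy observations of an arbitrary function
cov = lambda a, b: (1 + np.sqrt(6) * np.abs(a - b)) * np.exp(-np.sqrt(6) * np.abs(a - b))
x_obs = np.linspace(0.0, 1.0, 8)
y_obs = np.sin(4.0 * x_obs) + 0.05 * np.random.default_rng(0).standard_normal(8)
print(krige_with_noise(x_obs, y_obs, np.array([0.5]), cov, sigma2_mes=0.05**2))
```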

28 Illustration of Kriging prediction with measurement error. [Figure: observations.]

29 Illustration of Kriging prediction with measurement error. [Figure: observations; realizations from the conditional distribution given $Y_n$.]

30 Illustration of Kriging prediction with measurement error. [Figure: observations; realizations from the conditional distribution given $Y_n$; conditional mean $x_{new} \mapsto E(Y(x_{new}) | Y_n)$.]

31 Illustration of Kriging prediction with measurement error. [Figure: observations; realizations from the conditional distribution given $Y_n$; conditional mean $x_{new} \mapsto E(Y(x_{new}) | Y_n)$; 95% confidence intervals.]

32 Covariance function estimation. In most practical cases, the covariance function $(x_1, x_2) \mapsto C(x_1, x_2)$ is unknown, and it is important to choose it correctly. In practice, it is first constrained to a parametric family of the form $\{C_\theta, \theta \in \Theta\}$, $\Theta \subset \mathbb{R}^p$, e.g. the multidimensional Matérn 3/2 covariance function model on $\mathbb{R}^d$, with $\theta = (\sigma^2, \ell_1, \ldots, \ell_d)$. Then, most classically, the covariance parameter $\theta$ is automatically selected by Maximum Likelihood. In the case without measurement errors: let $Y_n = (Y(x_1), \ldots, Y(x_n))^t$ be the $n \times 1$ observation vector, and let $R_\theta$ be the $n \times n$ covariance matrix of $Y_n$ under covariance parameter $\theta$: $(R_\theta)_{i,j} = C_\theta(x_i, x_j)$. The Maximum Likelihood estimator $\hat{\theta}_{ML}$ of $\theta$ is then $\hat{\theta}_{ML} \in \mathrm{argmax}_{\theta \in \Theta}\; \frac{1}{(2\pi)^{n/2}\sqrt{\det(R_\theta)}} \exp\left(-\frac{1}{2} Y_n^t R_\theta^{-1} Y_n\right)$. We maximize the Gaussian probability density function of the observation vector, as a function of the covariance parameter. This is a numerical optimization problem, where the cost function has a $O(n^3)$ computational cost.
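In practice one minimizes the negative log-likelihood. A hedged sketch with NumPy/SciPy (the anisotropic Matérn helper and the data are placeholders, not taken from the original presentation):

```python
import numpy as np
from scipy.optimize import minimize

def matern_32_aniso(X, Y, sigma2, ells):
    diff = (X[:, None, :] - Y[None, :, :]) / ells
    h = np.sqrt(6.0) * np.sqrt(np.sum(diff ** 2, axis=-1))
    return sigma2 * (1.0 + h) * np.exp(-h)

def neg_log_likelihood(log_theta, X, y):
    """-log Gaussian density of y (up to a constant), theta = (sigma2, l_1, ..., l_d)."""
    sigma2, ells = np.exp(log_theta[0]), np.exp(log_theta[1:])
    R = matern_32_aniso(X, X, sigma2, ells) + 1e-8 * np.eye(len(y))   # jitter
    L = np.linalg.cholesky(R)                        # O(n^3) cost per evaluation
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ alpha + np.sum(np.log(np.diag(L)))

rng = np.random.default_rng(0)
X = rng.uniform(size=(40, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2               # stand-in for observed values
res = minimize(neg_log_likelihood, x0=np.zeros(3), args=(X, y), method="L-BFGS-B")
print(np.exp(res.x))                                 # estimated (sigma2, l_1, l_2)
```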

33 Summary for Kriging prediction. Most classical method:
1. Gather the observed values in the vector $Y_n$.
2. Choose a covariance function family, parameterized by $\theta$, generally before investigating the observed values in detail and from a limited number of classical options (e.g. Matérn 3/2).
3. Optimize the Maximum Likelihood criterion w.r.t. $\theta$, yielding $\hat{\theta}_{ML}$ (numerical optimization: gradient, quasi-Newton, genetic algorithm, ...; potential condition-number problems).
4. In the sequel, proceed as if the estimated covariance function $C_{\hat{\theta}_{ML}}(x_1, x_2)$ were the true covariance function (plug-in method).
5. Compute the conditional mean $x_{new} \mapsto E(Y(x_{new}) | Y_n)$ and the conditional variance $x_{new} \mapsto \mathrm{var}(Y(x_{new}) | Y_n)$ with explicit matrix-vector formulas (Gaussian conditioning theorem).
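For reference, this plug-in workflow (steps 1 to 5 above) is also available in off-the-shelf libraries. A sketch using scikit-learn, assuming it is installed (the data here are placeholders; this is one possible implementation, not the one used in the presentation):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, Matern, WhiteKernel

rng = np.random.default_rng(0)
X = rng.uniform(size=(50, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2 + 0.05 * rng.standard_normal(50)

# Matern 3/2 kernel with one length scale per input, a variance factor and a nugget;
# the hyperparameters are fitted by maximum likelihood inside .fit()
kernel = ConstantKernel() * Matern(length_scale=[1.0, 1.0], nu=1.5) + WhiteKernel()
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True).fit(X, y)

X_new = rng.uniform(size=(5, 2))
mean, std = gp.predict(X_new, return_std=True)       # conditional mean and std deviation
print(gp.kernel_)                                     # fitted covariance parameters
print(mean, std)
```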

34 Outline: 1. Introduction to Gaussian processes; 2. Kriging prediction (conditional distribution and Gaussian conditioning theorem; Kriging prediction); 3. Application to metamodeling of the GERMINAL code; 4. Application to validation of the FLICA 4 thermal-hydraulic code.

35 GERMINAL code: presentation. Context: the GERMINAL code simulates the thermal-mechanical impact of the irradiation on a nuclear fuel pin. Its utilization is part of a multi-physics and multi-objective optimization problem for reactor core design. In collaboration with Karim Ammar (PhD student, CEA, DEN).

36 GERMINAL code: inputs and outputs. 12 inputs $x_1, \ldots, x_{12} \in [0, 1]$ (normalization): $x_1, x_2$ are schedule parameters for the exploitation of the fuel pin; $x_3, \ldots, x_8$ are nature parameters of the fuel pin (geometry, plutonium concentration); $x_9, x_{10}, x_{11}$ are parameters for the characterization of the power map in the fuel pin; $x_{12}$ is the disposal volume for the fission gas produced in the fuel pin. 2 scalar variables of interest: $g_1$, the initial temperature (maximum, over space, of the temperature at the initial time; rather simple to approximate), and $g_2$, the fusion margin (minimum difference, over space and time, between the fusion temperature of the fuel and the current temperature; more difficult to approximate). General scheme: 12 scalar inputs -> GERMINAL run -> spatio-temporal maps -> 2 scalar outputs. We want to approximate 2 functions $g_1, g_2 : \mathbb{R}^{12} \to \mathbb{R}$.

37 GERMINAL code: data bases and measurement error. Data bases: for the first output $g_1$, we have a learning base of $n = 15722$ points (couples $(x_1, g_1(x_1)), \ldots, (x_n, g_1(x_n))$ with $x_i \in \mathbb{R}^{12}$) and a test base of $n_{test} = 6521$ elements. For the second output $g_2$, we have $n = 3807$ and $n_{test} = 1613$. Measurement errors: the GERMINAL computation scheme (GERMINAL + pre- and post-treatment) had not previously been used for so many inputs, which leads to numerical instabilities (some very close inputs can give significantly distant outputs). We incorporate the measurement error parameter $\sigma^2_{mes}$ to model the numerical instabilities (estimated by Maximum Likelihood, together with the covariance function parameters).

38 GERMINAL code: metamodels and performance indicators. Metamodel: a metamodel of $g$ is a function $\hat{g} : [0, 1]^{12} \to \mathbb{R}$ that is built using the learning base only. We consider 2 metamodels: the Kriging conditional mean (with Matérn 3/2 covariance function and measurement error variance estimated by Maximum Likelihood), and a neural-network method from the uncertainty platform URANIE. Once built, the cost of computing $\hat{g}(x_{new})$ for a new $x_{new} \in [0, 1]^{12}$ is very small compared to a GERMINAL run. Error indicator: Root Mean Square Error (RMSE) on the test base: $\mathrm{RMSE} = \sqrt{\frac{1}{n_{test}} \sum_{i=1}^{n_{test}} (\hat{g}(x_{test,i}) - g(x_{test,i}))^2}$.
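As a reminder of what this indicator computes, a tiny sketch (array names and values are illustrative, not the GERMINAL data):

```python
import numpy as np

def rmse(g_hat_test, g_test):
    """Root mean square error of metamodel predictions on a test base."""
    return float(np.sqrt(np.mean((np.asarray(g_hat_test) - np.asarray(g_test)) ** 2)))

print(rmse([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
```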

39 Results. For the initial temperature $g_1$ (standard deviation of 344): [Table: estimated $\sigma_{mes}$ and RMSE for the Kriging and neural-network metamodels; neural networks: 11.9]. For the fusion margin $g_2$ (standard deviation of 342): [Table: estimated $\sigma_{mes}$ and RMSE for the Kriging and neural-network metamodels; neural networks: 39.7]. Confirmation that output $g_2$ is more difficult to predict than $g_1$. In both cases, a significant part of the RMSE comes from the numerical instability, of order of magnitude $\sigma_{mes}$. The metamodels have overall quite good performance (3% and 10% relative error). The Kriging metamodel here has comparable or slightly better accuracy than the neural networks. On the other hand, the neural-network metamodel is significantly faster than Kriging (computational cost in $O(n)$ with $n$ large); nevertheless, both metamodels can be considered fast enough.

40 Outline: 1. Introduction to Gaussian processes; 2. Kriging prediction (conditional distribution and Gaussian conditioning theorem; Kriging prediction); 3. Application to metamodeling of the GERMINAL code; 4. Application to validation of the FLICA 4 thermal-hydraulic code.

41 Computer code and physical system. The computer code is represented by a function $f : \mathbb{R}^d \times \mathbb{R}^m \to \mathbb{R}$, $(x, \beta) \mapsto f(x, \beta)$. The physical system is represented by a function $Y_{real} : \mathbb{R}^d \to \mathbb{R}$, $x \mapsto Y_{real}(x)$, and the observations by $Y_{obs} : x \mapsto Y_{obs}(x) := Y_{real}(x) + \epsilon(x)$. The inputs in $x$ are the experimental conditions; the inputs in $\beta$ are the calibration parameters of the computer code; the outputs $f(x, \beta)$ and $Y_{real}(x)$ are the variables of interest; $\epsilon(x)$ is a measurement error.

42 Explaining discrepancies between simulation and experimental results. We have carried out experiments $Y_{obs}(x_1), \ldots, Y_{obs}(x_n)$. Discrepancies between the simulations $f(x_i, \beta)$ and the observations $Y_{obs}(x_i)$ can have 3 sources: misspecification of $\beta$; measurement errors on the observations $Y_{obs}(x_i)$; and errors on the specifications of the experimental conditions $x_i$. These 3 errors can be insufficient to explain the differences between simulations and experiments.

43 Gaussian process modeling of the model error. Gaussian process model: the unknown physical system is represented by a realization of a Gaussian process: $Y_{real}(x) = f(x, \beta) + Z(x)$ and $Y_{obs}(x) = Y_{real}(x) + \epsilon(x)$. $\beta$ is the calibration parameter (incorporation of expert knowledge with the Bayesian framework). $Z$ is the model error of the code; $Z$ is modeled as the realization of a Gaussian process.

44 Universal Kriging model. Linear approximation of the code: for all $x$, $f(x, \beta) = \sum_{i=1}^m h_i(x) \beta_i$ (small uncertainty on $\beta$). The observations then stem from a Gaussian process with a linearly parameterized mean function with unknown coefficients: a universal Kriging model. 3 steps, with similar matrix-vector formulas and interpretation as for the zero mean function case: estimation of the covariance function of $Z$; code calibration (conditional probability density function of $\beta$); prediction of the physical system (conditional mean $E(Y_{real}(x_{new}) | Y_n)$ and conditional variance $\mathrm{var}(Y_{real}(x_{new}) | Y_n)$).

45 Universal Kriging model: calibration and prediction formula (1/2). $Y_{real}$ is observed at $x_1, \ldots, x_n \in \mathbb{R}^d$. We consider here that we know the covariance function $C$ of the model error $Z$. Notations: let $Y_n = (Y_{real}(x_1), \ldots, Y_{real}(x_n))^t$ be the observation vector; it is a Gaussian vector. Let $R$ be the $n \times n$ covariance matrix of $(Z(x_1), \ldots, Z(x_n))$: $(R)_{i,j} = C(x_i, x_j)$. Let $x_{new} \in \mathbb{R}^d$ be a new input point; we want to predict $Y_{real}(x_{new})$. Let $r$ be the $n \times 1$ covariance vector between $Z(x_1), \ldots, Z(x_n)$ and $Z(x_{new})$: $r_i = C(x_i, x_{new})$. Let $H$ be the $n \times m$ matrix of partial derivatives of $f$ at $x_1, \ldots, x_n$: $H_{i,j} = h_j(x_i)$. Let $h$ be the $m \times 1$ vector of partial derivatives of $f$ at $x_{new}$: $h_i = h_i(x_{new})$. Let $\sigma^2_{mes}$ be the variance of the measurement error. Then the Gaussian conditioning theorem gives the conditional mean of $\beta$ given the observed values in $Y_n$: $\beta_{post} := E(\beta | Y_n) = \beta_{prior} + (Q_{prior}^{-1} + H^t (R + \sigma^2_{mes} I_n)^{-1} H)^{-1} H^t (R + \sigma^2_{mes} I_n)^{-1} (Y_n - H\beta_{prior})$.

46 Universal Kriging model: calibration and prediction formula (2/2). We also have the conditional mean of $Y_{real}(x_{new})$: $\hat{y}_{real}(x_{new}) := E(Y_{real}(x_{new}) | Y_n) = h^t \beta_{post} + r^t (R + \sigma^2_{mes} I_n)^{-1} (Y_n - H\beta_{post})$. The conditional variance of $Y_{real}(x_{new})$ is $\hat{\sigma}^2(x_{new}) := \mathrm{var}(Y_{real}(x_{new}) | Y_n) = C(x_{new}, x_{new}) - r^t (R + \sigma^2_{mes} I_n)^{-1} r + (h - H^t (R + \sigma^2_{mes} I_n)^{-1} r)^t (H^t (R + \sigma^2_{mes} I_n)^{-1} H + Q_{prior}^{-1})^{-1} (h - H^t (R + \sigma^2_{mes} I_n)^{-1} r)$. Interpretation: the prediction expression is decomposed into a calibration term and a Gaussian inference term for the model error. When the code has a small error on the $n$ observations, the prediction at $x_{new}$ uses almost only the calibrated code. The conditional variance is larger than when the mean function is known.
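A compact numerical sketch of these calibration and prediction formulas (NumPy; the basis functions, prior, covariance, and data below are illustrative placeholders, not the FLICA 4 setup):

```python
import numpy as np

def universal_kriging(x_obs, y_obs, x_new, cov, basis, beta_prior, Q_prior, sigma2_mes):
    """Bayesian calibration of beta and prediction of Y_real at a new point x_new."""
    S = cov(x_obs[:, None], x_obs[None, :]) + sigma2_mes * np.eye(len(x_obs))
    r = cov(x_obs, x_new)
    H = basis(x_obs)                      # n x m matrix, H_{i,j} = h_j(x_i)
    h = basis(np.atleast_1d(x_new))[0]    # m vector, h_i(x_new)
    Si = np.linalg.inv(S)
    A = np.linalg.inv(np.linalg.inv(Q_prior) + H.T @ Si @ H)
    beta_post = beta_prior + A @ H.T @ Si @ (y_obs - H @ beta_prior)
    mean = h @ beta_post + r @ Si @ (y_obs - H @ beta_post)
    u = h - H.T @ Si @ r
    var = cov(x_new, x_new) - r @ Si @ r + u @ A @ u
    return beta_post, mean, var

cov = lambda a, b: (1 + np.sqrt(6) * np.abs(a - b)) * np.exp(-np.sqrt(6) * np.abs(a - b))
basis = lambda x: np.column_stack([np.ones_like(x), x])     # h_1 = 1, h_2 = x
x_obs = np.linspace(0.0, 1.0, 10)
y_obs = 1.0 + 2.0 * x_obs + 0.1 * np.sin(10 * x_obs)
print(universal_kriging(x_obs, y_obs, 0.5, cov, basis,
                        beta_prior=np.zeros(2), Q_prior=np.eye(2), sigma2_mes=1e-4))
```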

47 Application to the FLICA 4 thermal-hydraulic code. The experiment consists in pressurized and possibly heated water passing through a cylinder; we measure the pressure drop between the two ends of the cylinder. Quantity of interest: the part of the pressure drop due to friction, $P_{fro}$. Two kinds of experimental conditions: system parameters (hydraulic diameter $D_h$, friction height $H_f$, channel width $e$) and environment variables (output pressure $P_s$, flowrate $G_e$, parietal heat flux $\Phi_p$, liquid enthalpy $h^l_e$, thermodynamic quality $X^e_{th}$, input temperature $T_e$). We have 253 experimental results.

48 Friction model of FLICA 4. The code is based on the (local) analytical model $P_{fro} = \frac{H}{2\rho D_h} G^2 f_{iso} f_h$, with $f_{iso}$ the isothermal friction model, parameterized by $a_t$ and $b_t$, and $f_h$ the monophasic (single-phase) correction model. Prior information case, with a given prior mean $\beta_{prior}$ and prior covariance matrix $Q_{prior}$ on the calibration parameters.

49 Results. We compare predictions to observations using Cross Validation. We have: the vector of posterior means $\hat{P}_{fro}$ of size $n$, and the vector of posterior variances $\sigma^2_{pred}$ of size $n$. 2 quantitative criteria: RMSE, $\sqrt{\frac{1}{n} \sum_{i=1}^n (P_{fro,i} - \hat{P}_{fro}(x_i))^2}$, and Confidence Interval Reliability, the proportion of observations that fall in the posterior 90% confidence interval.
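Both criteria are straightforward to compute from the cross-validation outputs. A sketch (the 1.645 quantile corresponds to a two-sided 90% Gaussian interval; the array names and values are illustrative):

```python
import numpy as np

def cv_criteria(p_obs, p_post_mean, p_post_var, level_quantile=1.645):
    """RMSE and 90% confidence-interval reliability from cross-validation outputs."""
    p_obs, p_post_mean = np.asarray(p_obs), np.asarray(p_post_mean)
    half_width = level_quantile * np.sqrt(np.asarray(p_post_var))
    rmse = np.sqrt(np.mean((p_obs - p_post_mean) ** 2))
    coverage = np.mean(np.abs(p_obs - p_post_mean) <= half_width)
    return rmse, coverage

print(cv_criteria([10.0, 12.0, 9.0], [10.5, 11.5, 9.2], [0.25, 0.36, 0.16]))
```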

50 Results with the thermal-hydraulic code FLICA 4 (2/2). RMSE and 90% Confidence Interval Reliability: nominal code, RMSE 661 Pa, reliability 234/253; Gaussian processes, RMSE 189 Pa, reliability 235/253. [Figure: prediction errors and 90% confidence intervals for the pressure drop, plotted against the observation index, for the nominal code and for the Gaussian process model.]

51 Conclusion. We can improve the predictions of a computer code by completing it with a Kriging model built from the experimental results. The number of experimental results needs to be sufficient, and there is no extrapolation. For more details: Bachoc F., Bois G., Garnier J. and Martinez J.M., Calibration and improved prediction of computer models by universal Kriging, Nuclear Science and Engineering 176(1) (2014).

52 General conclusion. Standard Kriging framework: a versatile and easy-to-use statistical model. We can incorporate a priori knowledge in the choice of the covariance function family; after this choice, the standard method is rather automatic. We associate confidence intervals to the predictions, and the Gaussian framework brings numerical criteria for the quality of the obtained model. Extensions: the Kriging model can be goal-oriented (optimization, code validation, estimation of failure regions, global sensitivity analysis, ...). The standard Kriging method can be computationally costly for large $n$; approximate Kriging prediction and covariance function estimation is a current research domain.

53 Some references (1/2). General books: C.E. Rasmussen and C.K.I. Williams, Gaussian Processes for Machine Learning, The MIT Press, Cambridge (2006); T.J. Santner, B.J. Williams and W.I. Notz, The Design and Analysis of Computer Experiments, Springer, New York (2003); M. Stein, Interpolation of Spatial Data: Some Theory for Kriging, Springer, New York (1999). On covariance function estimation: F. Bachoc, Cross Validation and Maximum Likelihood Estimations of Hyper-parameters of Gaussian Processes with Model Misspecification, Computational Statistics and Data Analysis 66 (2013); F. Bachoc, Asymptotic analysis of the role of spatial sampling for covariance parameter estimation of Gaussian processes, Journal of Multivariate Analysis 125 (2014). Kriging for optimization: D. Ginsbourger and V. Picheny, Noisy kriging-based optimization methods: a unified implementation within the DiceOptim package, Computational Statistics and Data Analysis 71 (2013).

54 Some references (2/2). Kriging for failure-domain estimation: C. Chevalier, J. Bect, D. Ginsbourger, E. Vazquez, V. Picheny and Y. Richet, Fast parallel kriging-based stepwise uncertainty reduction with application to the identification of an excursion set, Technometrics (2014). Kriging with multifidelity computer codes: L. Le Gratiet, Bayesian Analysis of Hierarchical Multifidelity Codes, SIAM/ASA Journal on Uncertainty Quantification (2013); L. Le Gratiet and J. Garnier, Recursive co-kriging model for Design of Computer Experiments with multiple levels of fidelity, International Journal for Uncertainty Quantification. Kriging for global sensitivity analysis: A. Marrel, B. Iooss, S. da Veiga and M. Ribatet, Global sensitivity analysis of stochastic computer models with joint metamodels, Statistics and Computing 22 (2012). Kriging for code validation: S. Fu, Inversion probabiliste bayésienne en analyse d'incertitude, PhD thesis, Université Paris-Sud 11.

55 Thank you for your attention!


Spatial Inference of Nitrate Concentrations in Groundwater Spatial Inference of Nitrate Concentrations in Groundwater Dawn Woodard Operations Research & Information Engineering Cornell University joint work with Robert Wolpert, Duke Univ. Dept. of Statistical

More information

Gaussian Processes (10/16/13)

Gaussian Processes (10/16/13) STA561: Probabilistic machine learning Gaussian Processes (10/16/13) Lecturer: Barbara Engelhardt Scribes: Changwei Hu, Di Jin, Mengdi Wang 1 Introduction In supervised learning, we observe some inputs

More information

Probability and Information Theory. Sargur N. Srihari

Probability and Information Theory. Sargur N. Srihari Probability and Information Theory Sargur N. srihari@cedar.buffalo.edu 1 Topics in Probability and Information Theory Overview 1. Why Probability? 2. Random Variables 3. Probability Distributions 4. Marginal

More information

Algorithms for Uncertainty Quantification

Algorithms for Uncertainty Quantification Algorithms for Uncertainty Quantification Tobias Neckel, Ionuț-Gabriel Farcaș Lehrstuhl Informatik V Summer Semester 2017 Lecture 2: Repetition of probability theory and statistics Example: coin flip Example

More information

Machine Learning. Bayesian Regression & Classification. Marc Toussaint U Stuttgart

Machine Learning. Bayesian Regression & Classification. Marc Toussaint U Stuttgart Machine Learning Bayesian Regression & Classification learning as inference, Bayesian Kernel Ridge regression & Gaussian Processes, Bayesian Kernel Logistic Regression & GP classification, Bayesian Neural

More information

Pattern Recognition and Machine Learning. Bishop Chapter 6: Kernel Methods

Pattern Recognition and Machine Learning. Bishop Chapter 6: Kernel Methods Pattern Recognition and Machine Learning Chapter 6: Kernel Methods Vasil Khalidov Alex Kläser December 13, 2007 Training Data: Keep or Discard? Parametric methods (linear/nonlinear) so far: learn parameter

More information

Hierarchical sparse Bayesian learning for structural health monitoring. with incomplete modal data

Hierarchical sparse Bayesian learning for structural health monitoring. with incomplete modal data Hierarchical sparse Bayesian learning for structural health monitoring with incomplete modal data Yong Huang and James L. Beck* Division of Engineering and Applied Science, California Institute of Technology,

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Structural Reliability

Structural Reliability Structural Reliability Thuong Van DANG May 28, 2018 1 / 41 2 / 41 Introduction to Structural Reliability Concept of Limit State and Reliability Review of Probability Theory First Order Second Moment Method

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

Statistical Data Mining and Machine Learning Hilary Term 2016

Statistical Data Mining and Machine Learning Hilary Term 2016 Statistical Data Mining and Machine Learning Hilary Term 2016 Dino Sejdinovic Department of Statistics Oxford Slides and other materials available at: http://www.stats.ox.ac.uk/~sejdinov/sdmml Naïve Bayes

More information

Tokamak profile database construction incorporating Gaussian process regression

Tokamak profile database construction incorporating Gaussian process regression Tokamak profile database construction incorporating Gaussian process regression A. Ho 1, J. Citrin 1, C. Bourdelle 2, Y. Camenen 3, F. Felici 4, M. Maslov 5, K.L. van de Plassche 1,4, H. Weisen 6 and JET

More information

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas

Density estimation Nonparametric conditional mean estimation Semiparametric conditional mean estimation. Nonparametrics. Gabriel Montes-Rojas 0 0 5 Motivation: Regression discontinuity (Angrist&Pischke) Outcome.5 1 1.5 A. Linear E[Y 0i X i] 0.2.4.6.8 1 X Outcome.5 1 1.5 B. Nonlinear E[Y 0i X i] i 0.2.4.6.8 1 X utcome.5 1 1.5 C. Nonlinearity

More information

A Gaussian state-space model for wind fields in the North-East Atlantic

A Gaussian state-space model for wind fields in the North-East Atlantic A Gaussian state-space model for wind fields in the North-East Atlantic Julie BESSAC - Université de Rennes 1 with Pierre AILLIOT and Valï 1 rie MONBET 2 Juillet 2013 Plan Motivations 1 Motivations 2 Context

More information