Bayesian inference of random fields represented with the Karhunen-Loève expansion


Felipe Uribe (corresponding author, felipe.uribe@tum.de), Iason Papaioannou (iason.papaioannou@tum.de), Wolfgang Betz (wolfgang.betz@tum.de), Daniel Straub (straub@tum.de)
Engineering Risk Analysis Group, Technische Universität München, Arcisstraße 21, 80333 München, Germany.

Preprint submitted to Computer Methods in Applied Mechanics and Engineering, November 2018.

Abstract

The integration of data into engineering models involving uncertain and spatially varying parameters is oftentimes key to obtaining accurate predictions. Bayesian inference is effective in achieving such an integration. Uncertainties related to spatially varying parameters are typically represented through random fields discretized into a finite number of random variables. The prior correlation length and variance of the field, as well as the number of terms used in the random field discretization, have a considerable impact on the outcome of the Bayesian inference, an impact that has nevertheless received little attention in the literature. Here, we investigate the implications of different choices in the prior random field model on the outcome of the Bayesian inference. We employ the Karhunen-Loève expansion for the representation of the random fields. We show that a higher-order Karhunen-Loève discretization is required in Bayesian inverse problems, as compared to standard prior uncertainty propagation. Furthermore, the smoothing effect of the forward operator has a large influence on the posterior solution, especially when the quantity of interest is sensitive to local random fluctuations of the inverse quantity. This is also reflected in the magnitude of updated rare event probabilities. We illustrate these effects analytically through a 1D cantilever beam with spatially varying flexibility, and numerically using a 2D linear elasticity example where the Young's modulus is spatially variable.

Keywords: uncertainty quantification, inverse problems, Bayesian inference, random fields, Karhunen-Loève expansion, reliability updating.

1. Introduction

In science and engineering, physical systems are typically modeled by partial differential equations (PDEs). Such PDEs require a proper description of the underlying system inputs and parameters. In practice, there is often significant uncertainty about the actual value of these properties. The associated uncertainties can be reduced through measurements of the system response. If these observations are combined with the PDE model, information about the uncertain system inputs and parameters can be retrieved. This inference process is referred to as an inverse problem. The stability of the inverse problem is mainly controlled by the dimension of the parameter space, the structure of the PDE, and the observations, which are in most cases scarce and noisy. Hence, inverse problems are typically ill-posed. This means that different values of the model parameters are consistent with the data, or that the parameters cannot be identified at all. Bayesian statistical methods provide a tool to regularize the problem by incorporating a probabilistic description of the model parameters that combines prior information with observations [, ]. In this case, the objective is to estimate the posterior probability density function (PDF) of the model parameters. Closed-form expressions for the posterior density are only available for some particular cases.

Hence, Bayesian inverse problems are generally solved using sampling-based methods, such as Markov chain Monte Carlo (MCMC), importance sampling [3], sequential Monte Carlo [4], structural reliability methods [5], or approximation methods such as transport maps [6] and variational inference [7], among others.

An additional level of complexity is present when the unknown model parameters fluctuate randomly in space. A common example arises in continuum mechanics, where material parameters are spatially variable, such as the Young's modulus in elasticity theory [8], conduction and convection coefficients in heat transfer problems [9], or permeability fields in hydraulic tomography applications []. The uncertainty related to spatially varying properties is generally represented by random fields. This mathematical object implies an infinite-dimensional collection of random variables indexed by the spatial coordinates of the continuous domain of the system. For Bayesian inverse problems that are solved numerically, the infinite-dimensional parameter space needs to be projected onto a suitable finite-dimensional one. The Karhunen-Loève (K-L) expansion is a random field discretization approach that is optimal in the mean-squared-error sense as compared to any other spectral projection algorithm. This method employs the eigenvalues and eigenfunctions of the autocovariance operator describing the random field to construct a series expansion with random coefficients. In practice, it is common to truncate the K-L expansion after a finite number of terms. Thereafter, the uncertain parameters associated with the full random field are replaced by the coefficients of the truncated expansion, thereby reducing the dimensionality of the inverse problem. We remark that alternative approaches to dimensionality reduction are also used in the context of Bayesian inversion, as in the case of wavelet-based parametrization [], likelihood-informed subspaces [], active subspaces [3], or the approach recently proposed in [4].

A main challenge in Bayesian inference of random fields is the choice of the prior distribution for the parameters that generate the field. Commonly, the number of terms used in the random field discretization is fixed, as are the correlation length and variance of the field. These quantities have a considerable impact on the random field representation and, consequently, on the Bayesian inversion. The difficulty of selecting appropriate prior distributions for random fields has fostered research on hierarchical Bayesian approaches. In this regard, Marzouk and Najm [5] applied the K-L expansion with a hierarchical Gaussian process prior using the mean and variance of the field as hyperparameters; the full mathematical model is replaced by a polynomial chaos surrogate, yielding an efficient evaluation of the likelihood function. They also performed an error analysis on the K-L approximation of the posterior random fields. Tagade and Choi [6] extended the approach in [5] using a larger hierarchical structure, where the correlation length was also part of the inference process. Cotter et al. [7] generalized several MCMC algorithms to the realm of functions. In particular, they proposed a Metropolis-within-Gibbs algorithm to infer both the random coefficients of the K-L expansion and its truncation order. Mondal et al. [8] also performed inference on the number of terms in the K-L expansion by applying the reversible jump Markov chain Monte Carlo algorithm [9]. Sraj et al. [] included the correlation length of the field as a hyperparameter.
They proposed a parametrized autocovariance function to reduce the computational cost associated with the repeated solution of the eigenvalue problem required by the sample-based inference process. Moreover, Roininen et al. [] applied Cauchy and Gaussian hyperpriors to the correlation length of the field using non-homogeneous Matérn covariance kernels. They used a combined Gibbs and Metropolis-within-Gibbs algorithm for the solution of the hierarchical Bayesian inverse problem. Recently, Fuglstad et al. [] used the concept of penalized complexity priors proposed in [3] to derive a joint prior for the variance and correlation length of Gaussian random fields; they also provided guidelines for selecting the hyperparameters and priors for non-homogeneous random fields. Latz et al. [4] proposed a Metropolis-within-Gibbs algorithm to jointly infer a parameterized Gaussian random field and its correlation length; they applied a reduced basis algorithm to decrease the computational cost of the simulation.

Despite extensive research on the development of numerical methods for Bayesian inference of random fields, the influence of the random field discretization on the solution of the inverse problem has received little attention. Li [5] derived an error bound between the maximum a posteriori estimator and the truncated K-L representation in terms of the eigenvalues of the prior covariance. Spantini et al. [6] pointed out that, in order to avoid large truncation errors in the posterior solution associated with the K-L discretization, the prior distribution needs to impose significant smoothness on the parameters (i.e., the eigenvalues of the prior covariance decay fast).

In this paper, the effect of the K-L discretization on the Bayesian inverse problem solution is investigated. We extend the analysis of [7] by showing analytically and numerically the influence of different prior assumptions on the posterior solution. We perform two studies: (i) a one-dimensional example, for which closed-form expressions of the posterior random field can be derived. A parametric study is carried out to evaluate the influence of the prior correlation length, the autocovariance kernel of the field, and the number of terms in the K-L expansion on the posterior random field. Different sets of observations are also considered in order to assess the influence of the number of measurement points on the random field updating. The analytical expressions enable us to perform a systematic error analysis of the posterior mean and variance approximations. Furthermore, for a given parameter setting, we perform model selection on different truncation orders in the K-L expansion. This allows us to evaluate whether a larger number of K-L terms is required for the solution of the Bayesian inverse problem as compared to the forward problem solution; (ii) a two-dimensional numerical example is used to study the smoothing effect of the forward operator on different quantities of interest (QoI). In this case, the Bayesian inverse problem is solved using the BUS (Bayesian updating with structural reliability methods) approach proposed in [5, 8]. The identified random field is then employed to evaluate the influence of the random field discretization on the updating of rare event probabilities, using the approach discussed in [9].

The remainder of this work is structured as follows: in 2, a brief summary of random fields and the K-L expansion is presented; Whittle-Matérn covariance kernels and error measures used for the random field discretization are also introduced. In 3, the Bayesian approach to inverse problems in the context of random fields is formulated; furthermore, the principles of the BUS approach are described. Next, the influence of the random field discretization on the posterior random field is demonstrated by means of analytical and numerical experiments in 4. The main findings of the study are summarized in 5, and the paper closes with the conclusions.

2. Modeling and representation of random fields

Random fields provide an effective tool for the modeling of system inputs and parameters that fluctuate continuously through space. The following discussion follows the expositions of Adler [3] and Grigoriu [3].

2.1. Definition of a random field

Let $(\Omega, \mathcal{F}, \mathbb{P})$ be a probability space, $D \subset \mathbb{R}^d$ a bounded index set representing a physical domain, and $L^2(\Omega, \mathbb{P})$ the Hilbert space of second-order random variables (finite variance), with inner product $\langle X, Y \rangle = \mathbb{E}[XY]$ for $X, Y \in L^2(\Omega, \mathbb{P})$. A random field can be understood as a function $H(x, \omega): D \times \Omega \to \mathbb{R}$, with arguments $x \in D$ a spatial coordinate and $\omega \in \Omega$ a generic outcome of the sample space [3]. Intuitively, a random field is a collection of random variables representing uncertain values at each spatial coordinate of $D$. Notice that if $D$ is uncountable, it is not possible to specify the joint distribution of all random variables defining the random field. Hence, from a modeling perspective, a random field is characterized in terms of its finite-dimensional (fi-di) distributions (general definitions are given in [3, 3]).
Consider the finite set of points $\mathbf{x} = \{x_1, \ldots, x_n \mid x_i \in D,\ i = 1, \ldots, n\}$, associated with a set of random variables $H(\mathbf{x}, \omega) = \{H(x_1, \omega), \ldots, H(x_n, \omega) \mid H(x_i, \omega) \in L^2(\Omega, \mathbb{P})\}$, with joint distribution $F_H(\mathbf{y}) = \mathbb{P}[H(\mathbf{x}, \omega) \le \mathbf{y}]$, called the $n$-th order fi-di distribution of the random field [3]. A random field is defined by its family of fi-di distributions, provided they exist and satisfy Kolmogorov's conditions of consistency and symmetry (see [3, p.3]). A random field is Gaussian if its fi-di distributions are multivariate Gaussian for any $\mathbf{x} \in D$ [3]. Gaussian random fields are completely characterized by their first- and second-order moments, i.e., the mean function $\mu_H(x) = \mathbb{E}[H(x, \omega)]$ and the autocovariance function $C_{HH}(x, x') = \mathbb{E}[(H(x, \omega) - \mu_H(x))(H(x', \omega) - \mu_H(x'))] = \sigma_H(x)\,\sigma_H(x')\,R_{HH}(x, x')$, with $\sigma_H(x)$, $\sigma_H(x')$ and $R_{HH}(x, x')$ the standard deviation and autocorrelation functions of the field. Moreover, a random field is said to be homogeneous if the associated fi-di distributions of the field are invariant under arbitrary shifts in space $d = x - x'$; and the field is weakly homogeneous if the mean function $\mu_H(x) = \mu_H$ is space-invariant and the autocovariance function only depends on the shift, i.e., $C_{HH}(x, x') = C_{HH}(d)$ [3].

Further, if the autocovariance function is independent of the direction, i.e., if it is a function of the Euclidean norm $d = \|x - x'\|$, the random field is isotropic.

It is clear that a proper definition of a random field implies the construction of a fi-di distribution family with $n \to \infty$. Such a theoretical description is not commonly used in practice, since it is not feasible to collect sufficient data to verify the assumed probabilistic models. Hence, the process of representing a continuous-parameter random field in terms of a finite set of random variables requires the use of stochastic discretization schemes (e.g., [3]). Among these representation techniques, methods based on finite expansions of random variables and deterministic functions are popular. These include the Karhunen-Loève expansion [33, 34], which expresses a random field as a linear combination of orthogonal functions chosen as the eigenfunctions resulting from the spectral decomposition of the autocovariance function of the field.

2.2. Karhunen-Loève expansion

Let us consider a real-valued random field $H(x, \omega)$ with continuous mean $\mu_H(x): D \to \mathbb{R}$ and autocovariance function $C_{HH}(x, x'): D \times D \to \mathbb{R}$. Autocovariance functions belong to the class of Hilbert-Schmidt kernels (functions with finite $L^2$-norm), which are symmetric and positive-semidefinite [3]. These properties guarantee the existence of an orthonormal basis consisting of the eigenfunctions of the associated covariance operator, such that the sequence of corresponding eigenvalues is real and non-negative [35]. Following Mercer's theorem, the autocovariance kernel can be represented by a series expansion based on the spectral representation of the covariance operator [36, p.48],

$$C_{HH}(x, x') = \sum_{k=1}^{\infty} \lambda_k\, \phi_k(x)\, \phi_k(x') \qquad (1)$$

where $\lambda_k \in [0, \infty)$ (with $\lambda_k \ge \lambda_{k+1}$ and $\lim_{k \to \infty} \lambda_k = 0$) and $\phi_k(x): D \to \mathbb{R}$ are the eigenvalues and eigenfunctions of the covariance operator. A direct consequence of this result is the representation of a random field in terms of a series expansion. Hence, a second-order random field $H(x, \omega)$ can be approximated by $\hat{H}(x, \omega)$ using the Karhunen-Loève (K-L) expansion truncated after the $M$-th term as [37]

$$H(x, \omega) \approx \hat{H}(x, \omega) := \mu_H(x) + \sum_{k=1}^{M} \sqrt{\lambda_k}\, \phi_k(x)\, \theta_k(\omega), \qquad (2)$$

here, $\theta_k(\omega): \Omega \to \mathbb{R}$ is a set of mutually uncorrelated random variables with zero mean and unit variance (i.e., $\mathbb{E}[\theta_k(\omega)] = 0$ and $\mathbb{E}[\theta_k(\omega)\theta_l(\omega)] = \delta_{kl}$). If the random field is Gaussian, the random variables $\theta_k(\omega)$ are independent standard Gaussian. In any other case, the joint distribution of $\theta_k(\omega)$ is difficult to obtain. However, a class of non-Gaussian random fields, the so-called translation fields [38], can still be represented with the K-L expansion through a suitable isoprobabilistic transformation of an underlying Gaussian field. Notice that Eq. (2) separates the random field as $H(x, \omega) = \mu_H(x) + H_\sigma(x, \omega)$, that is, into the mean path of the field and a zero-mean (centered) random field that incorporates the covariance information.

The set of eigenpairs $\{\lambda_k, \phi_k\}$ is computed through the solution of a homogeneous Fredholm integral equation of the second kind [37],

$$\int_D C_{HH}(x, x')\, \phi_k(x')\, dx' = \lambda_k\, \phi_k(x), \qquad (3)$$

whose analytical solution exists only for specific cases of autocovariance functions [37]. In general, this equation is solved numerically using projection methods (e.g., collocation, Galerkin) [39], which express the eigenfunctions as a linear combination of complete basis functions.
Other approaches include degenerate kernel methods [4], which approximate the target kernel by a separable kernel given by the sum of a finite number of products of functions; Nyström methods [4], which solve the integral equation using Gaussian quadrature rules; and circulant embedding [4], which uses the fast Fourier transform to diagonalize a nested-block-circulant extension of a nested-block-Toeplitz covariance matrix; this construction provides a finite expansion of the field in terms of a deterministic basis.
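To make the truncated expansion of Eq. (2) concrete, the following sketch approximates the eigenpairs of Eq. (3) by an eigendecomposition of the covariance matrix evaluated on a fine grid (a simple Nyström-type scheme with uniform weights) and draws one realization of the discretized field. It is only an illustration of the mechanics; the domain, kernel, correlation length and truncation order are assumed values and do not correspond to the settings used later in the paper.

```python
import numpy as np

# Assumed setting: exponential kernel on D = [0, L], unit variance
L, n = 5.0, 500
x = np.linspace(0.0, L, n)
dx = L / (n - 1)
sigma_H, l_c = 1.0, 0.5
C = sigma_H**2 * np.exp(-np.abs(x[:, None] - x[None, :]) / l_c)

# Nystrom-type approximation of the Fredholm problem, Eq. (3):
# the eigendecomposition of C*dx yields discrete eigenpairs (lambda_k, phi_k)
lam, phi = np.linalg.eigh(C * dx)
lam = np.clip(lam[::-1], 0.0, None)          # decreasing, non-negative eigenvalues
phi = phi[:, ::-1] / np.sqrt(dx)             # eigenvectors normalized in L2(D)

# Truncated K-L expansion, Eq. (2), with standard Gaussian coefficients theta_k
M, mu_H = 20, 0.0
theta = np.random.randn(M)
H_hat = mu_H + phi[:, :M] @ (np.sqrt(lam[:M]) * theta)

# Fraction of the total (integrated) variance captured by the first M terms
print("captured variance fraction:", lam[:M].sum() / (sigma_H**2 * L))
```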

2.3. Whittle-Matérn covariance kernels

Covariance kernels for random field modeling are empirical models used to define the particular correlation characteristics of a random field. A flexible class of isotropic Hilbert-Schmidt kernels used for the definition of random fields is the so-called Whittle-Matérn family, defined as [43]

$$C_\nu(d) = \sigma_H^2\, \frac{2^{1-\nu}}{\Gamma(\nu)} \left(\frac{\sqrt{2\nu}\, d}{l_c}\right)^{\nu} K_\nu\!\left(\frac{\sqrt{2\nu}\, d}{l_c}\right) \qquad (4)$$

where $d = \|x - x'\|$, $\Gamma(\cdot)$ is the gamma function, $K_\nu(\cdot)$ is the modified Bessel function of the second kind, $l_c$ is a range parameter (correlation length), and $\nu > 0$ is a smoothing parameter. The parameters of the Matérn model, $l_c$ and $\nu$, can be fitted based on experimental measurements. The value of $\nu$ determines the smoothness of the random field; this is important when the field is used to make predictions. However, $\nu$ is typically fixed, since it is poorly identified in practical applications [44]. From Eq. (4), the special case $\nu = 1/2$ and the limiting case $\nu \to \infty$ are of particular interest,

$$C_{1/2}(d) = \sigma_H^2 \exp\!\left(-\frac{d}{l_c}\right) \qquad \text{and} \qquad C_{\infty}(d) = \sigma_H^2 \exp\!\left(-\frac{d^2}{2\, l_c^2}\right) \qquad (5)$$

which correspond to the non-differentiable exponential and the infinitely differentiable squared exponential (also called Gaussian) autocovariance kernels, respectively.

2.4. Error measures for random field discretization

For the K-L expansion, the number of terms to be included in the series is closely related to the magnitudes of the eigenvalues of the covariance operator, which in turn strongly depend on the correlation length of the field. Specifically, the quality of the discretization is quantified with respect to the level of accuracy in the estimation of the exact mean (bias) and variance (variability) functions of the random field. Local point-wise error measures for the mean and variance can be defined as the relative difference between the exact and approximated random fields:

$$\epsilon_\mu(x) = \frac{\mathbb{E}[H(x, \omega)] - \mathbb{E}[\hat{H}(x, \omega)]}{\mathbb{E}[H(x, \omega)]} \qquad \epsilon_\sigma(x) = \frac{\mathbb{V}[H(x, \omega)] - \mathbb{V}[\hat{H}(x, \omega)]}{\mathbb{V}[H(x, \omega)]} \qquad (6)$$

here, $\epsilon_\mu$ and $\epsilon_\sigma$ are the relative errors in the mean and variance, respectively. Global error measures can also be applied to quantify the overall quality of the random field representation. These measures are defined for the mean and the variance as their average values over the domain of definition $D$ of the random field [39],

$$\bar{\epsilon}_\mu = \frac{1}{|D|} \int_D \epsilon_\mu(x)\, dx \qquad \text{and} \qquad \bar{\epsilon}_\sigma = \frac{1}{|D|} \int_D \epsilon_\sigma(x)\, dx \qquad (7)$$

where $|D| = \int_D dx$. These error measures allow one to evaluate the quality of the prior and posterior random field estimates. For the prior random field, the mean function can be represented exactly with the K-L expansion, i.e., $\epsilon_\mu(x) = 0$; and the variance function is approximated as $\mathbb{V}[\hat{H}(x, \omega)] = \sum_{k=1}^{M} \lambda_k\, \phi_k^2(x)$, which yields $\epsilon_\sigma(x) = 1 - \frac{1}{\sigma_H^2} \sum_{k=1}^{M} \lambda_k\, \phi_k^2(x)$. For the posterior random field, these expressions are no longer valid and estimation based on posterior statistics is necessary. When data is available, other error measures are of relevance, such as the relative misfit between the approximated forward model and the observed data. In this study, only global error measures that average local point-errors over the domain are considered, to facilitate a comparison between the prior and posterior random field approximations.
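As a small numerical complement to Eqs. (4)-(7), the sketch below evaluates the Whittle-Matérn kernel with scipy and reports the global prior variance error of a truncated K-L discretization obtained as in the previous sketch. All parameter values are assumed for illustration only.

```python
import numpy as np
from scipy.special import gamma, kv

def matern(d, sigma2=1.0, l_c=0.5, nu=0.5):
    """Whittle-Matern covariance of Eq. (4); nu = 0.5 recovers the exponential kernel."""
    d = np.maximum(np.asarray(d, dtype=float), 1e-12)   # avoid the singularity of kv at d = 0
    s = np.sqrt(2.0 * nu) * d / l_c
    return sigma2 * (2.0**(1.0 - nu) / gamma(nu)) * s**nu * kv(nu, s)

# Discrete K-L of a unit-variance field on D = [0, 5] (assumed setting)
L, n, M = 5.0, 400, 20
x = np.linspace(0.0, L, n)
dx = L / (n - 1)
C = matern(np.abs(x[:, None] - x[None, :]))
lam, phi = np.linalg.eigh(C * dx)
lam = np.clip(lam[::-1], 0.0, None)
phi = phi[:, ::-1] / np.sqrt(dx)

# Pointwise variance error of Eq. (6) (with sigma_H^2 = 1) and its global average, Eq. (7)
eps_sigma = 1.0 - (phi[:, :M]**2 * lam[:M]).sum(axis=1)
print("global variance error:", eps_sigma.mean())
```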

3. Bayesian inference for inverse problems

3.1. Inference of random fields

Consider the forward problem $y = \mathcal{G}(H(x, \omega))$, where $\mathcal{G}: L^2(D) \to L^2(D)$ is the forward response model expressing the relationship between the model output and the spatially varying parameters. The forward model $\mathcal{G}$ is generally masked by an observation operator, such that the model output is computed at $m$ specific measurement locations, with $m$ denoting the number of observations. Since the spatially varying parameters are modeled by random fields, they are parametrized in the physical and stochastic space by $x$ and $\omega$. As a result, $\mathcal{G}$ typically implies the solution of a stochastic partial differential equation (SPDE) projected onto the space of observations $\mathcal{Y} \subseteq \mathbb{R}^m$.

The observed data frequently contain noise. In classical inverse problems, this noise is usually modeled as additive and mutually independent of the uncertain parameters; this assumption yields

$$\tilde{y} = \mathcal{G}(H(x, \omega)) + \eta \qquad (8)$$

where $\tilde{y} \in \mathbb{R}^m$ is the vector of observations, and the noise random vector $\eta \in \mathbb{R}^m$ follows a Gaussian distribution with mean zero and non-singular covariance matrix $\Sigma_{\eta\eta} \in \mathbb{R}^{m \times m}$. Other noise models exist in the literature, e.g., multiplicative errors, convolution of measurement and model error distributions, among others (see [, 45]).

The inverse problem in Eq. (8) is difficult to solve since it is generally ill-posed. This is mainly because the outcome space of the random field is infinite-dimensional, while the dimension of the data space is finite. For this reason, the framework of Bayesian statistical theory is employed. The advantage of Bayesian inference for inverse problems lies in the fact that the prior information represents a mechanism of regularization [, 5]. Furthermore, Bayesian updating facilitates the assessment of the impact of the uncertain parameters on the solution of the forward problem, on the prediction of a given quantity of interest, and on the estimation of rare event probabilities.

Bayes' theorem in infinite dimensions is interpreted as the Radon-Nikodym derivative of the posterior probability measure with respect to the prior probability measure [46]. In practice, the random field $H(x, \omega)$ is substituted by its discrete representation $\hat{H}(x, \omega)$ in terms of a finite number of random variables. In most cases, the discretized random field lies in a high-dimensional parameter space. In particular, the K-L expansion can be used to reduce the dimensionality and parametrize the random field. Consider the square-integrable random vector $\theta(\omega) \in \Theta \subseteq \mathbb{R}^M$ resulting from a truncated K-L series expansion (Eq. (2)). Observe that since the parameter $\theta(\omega)$ characterizes the randomness of the field, performing inference on the random field $\hat{H}(x, \omega)$ is analogous to inferring directly the random vector $\theta(\omega)$; we henceforth denote the approximated random field as $\hat{H}(x, \theta)$, and consequently, the forward response operator is now a map $\mathcal{G}: \Theta \to \mathcal{Y}$.

In Bayesian inverse problems, it is assumed that the initial knowledge about the parameters before considering any measurement can be summarized by a probability density function (PDF) $\theta \sim f(\theta)$, called the prior distribution. The updated belief about $\theta$ after including the data $\tilde{y}$ represents the solution of the inverse problem, that is, the posterior distribution $f(\theta \mid \tilde{y})$. Following Bayes' theorem, this conditional PDF is []

$$f(\theta \mid \tilde{y}) = \frac{f(\theta)\, L(\theta \mid \tilde{y})}{Z(\tilde{y})} \qquad (9)$$

where the likelihood function $L(\theta \mid \tilde{y}) = f(\tilde{y} \mid \theta)$ provides a link between model and data, and the model evidence $Z(\tilde{y}) = \int_\Theta L(\theta \mid \tilde{y})\, f(\theta)\, d\theta$ is a normalization constant.
The value of $Z(\tilde{y})$ gives information about the plausibility of the assumed model, and it is used in the context of model selection and averaging [47]. As a result of the K-L representation, Gaussian or translation random fields are implicitly endowed with a multivariate Gaussian prior distribution (also known as a Gaussian process prior) whose second-order moment properties need to be defined, that is, the prior mean and autocovariance functions. Even for homogeneous Gaussian random fields controlled only by the correlation length and the variance, the choice of the prior distribution remains a challenge. This is due to the fact that the prior information about the autocovariance kernel is usually vague. Additionally, the observed data is often not sufficient to clearly identify the correlation structure. Therefore, the assumed prior probabilistic model has a large influence on the posterior random field solution and on the rare event updating.

A hierarchical Bayesian framework simplifies the prior modeling of the target random field by the inclusion of hyperparameters for the definition of the autocovariance kernel, such as the variance and correlation length of the field [5, 6, ]. This approach is not considered here, since the target is to directly study the implications of different parameter choices on the posterior solution.

Remark 1. Since the random vector $\theta \in \mathbb{R}^M$ of K-L expansion coefficients is standard Gaussian distributed, the prior density is fixed as $f(\theta) = \mathcal{N}(0, \mathbf{I})$, with $\mathbf{I} \in \mathbb{R}^{M \times M}$ the identity matrix. For a given modeling setting, the prior information about the second-order properties of the field enters directly in the definition of the likelihood function.

3.2. The BUS framework

In most cases, the solution of Bayesian inverse problems requires the application of numerical methods. MCMC-based algorithms are typically employed to generate samples from the target posterior distribution. A disadvantage of standard MCMC samplers is that they often incur a large computational cost, since the underlying PDE model needs to be solved many times to achieve convergence. Moreover, the convergence rate of such methods typically deteriorates when the dimension of the parameter space increases. Specialized sampling-based algorithms [4, 7, 3] alleviate some of the issues of standard MCMC.

A recently proposed framework for Bayesian inference is BUS (Bayesian Updating with Structural reliability methods) [5]. The BUS approach is based on the classical rejection sampling algorithm and expresses Bayesian inference as an equivalent rare event simulation problem. Let $\pi(\theta)$ be an unnormalized version of the posterior distribution in Eq. (9), i.e., $\pi(\theta) = f(\theta)\, L(\theta \mid \tilde{y})$. In BUS, the proposal distribution of rejection sampling $q(\theta)$ is set equal to the prior distribution $f(\theta)$ (provided that $f(\theta)$ has heavier tails than $\pi(\theta)$). The acceptance probability of rejection sampling becomes

$$\alpha = \frac{\pi(\theta)}{\hat{c}\, q(\theta)} = \frac{f(\theta)\, L(\theta \mid \tilde{y})}{\hat{c}\, f(\theta)} = c\, L(\theta \mid \tilde{y}) \qquad (10)$$

where $c = 1/\hat{c}$ is a positive constant satisfying $c\, L(\theta \mid \tilde{y}) \le 1$. Consequently, a proposed sample $\theta \sim f(\theta)$ is accepted if $\upsilon \le c\, L(\theta \mid \tilde{y})$, and rejected otherwise. The auxiliary parameter $\upsilon \in \Upsilon = [0, 1]$ is a standard uniform random variable ($\upsilon \sim \mathcal{U}[0, 1]$) that is included in the space of random variables ($\bar{\Theta} = [\Theta, \Upsilon]$). From this construction we can define the space

$$\mathcal{H} = \{[\theta, \upsilon] \in \bar{\Theta} : \upsilon \le c\, L(\theta \mid \tilde{y})\}. \qquad (11)$$

In reliability analysis, the space $\mathcal{H}$ can be seen as a failure domain with associated limit state function $h(\theta, \upsilon) = \upsilon - c\, L(\theta \mid \tilde{y})$. We refer to this space as the observation domain, since samples drawn from the prior distribution follow the posterior distribution if and only if they belong to $\mathcal{H}$. Observe also that if the samples belong to $\mathcal{H}$, they describe a failure event that represents a rare event estimation problem. This connection allows us to use existing methods from rare event simulation to perform Bayesian inference. For instance, the classical rejection sampling algorithm corresponds to employing standard Monte Carlo simulation in BUS. In order to perform Bayesian inference efficiently, BUS is typically combined with the subset simulation (SuS) method [48]. The main advantage of SuS lies in its ability to transform a rare event estimation problem into a sequence of problems involving more frequent events. Moreover, the performance of the method does not deteriorate with increasing dimension of the uncertain parameter space.
When using SuS in combination with BUS, the resulting posterior samples are unweighted but correlated (due to the adaptive choice of the intermediate levels and the MCMC steps) [8]. The implementation of BUS requires the choice of the constant $c = 1/\hat{c}$. In BUS, the parameter $\hat{c}$ is optimally chosen as the maximum of the likelihood function. However, in most cases this value is not known in advance. Therefore, an adaptive version of BUS, in which the constant $c$ is not required beforehand and is computed sequentially as the simulation evolves, is proposed in [49, 8]. Additionally, a method that does not require the scaling constant $\hat{c}$ to be equal to the maximum likelihood and incorporates a re-sampling step to draw samples from the posterior is introduced in [5].
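The basic BUS construction of Eqs. (10)-(11) can be illustrated with plain rejection sampling (i.e., standard Monte Carlo on the observation domain). The sketch below does this for a toy scalar problem; the prior, likelihood and constant c are assumed purely for illustration and are unrelated to the examples of Section 4.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (assumed): scalar parameter, standard Gaussian prior, Gaussian likelihood
y_obs, sigma_eps = 0.8, 0.3
def likelihood(theta):
    return np.exp(-0.5 * ((y_obs - theta) / sigma_eps) ** 2)

# BUS constant c = 1/c_hat with c_hat >= max L; here the maximum of L is 1 by construction
c = 1.0

# Rejection sampling on the observation domain H = {(theta, u) : u <= c * L(theta)}, Eq. (11)
N = 100_000
theta = rng.standard_normal(N)       # prior samples theta ~ f(theta)
u = rng.uniform(size=N)              # auxiliary variable u ~ U[0, 1]
accepted = u <= c * likelihood(theta)

posterior = theta[accepted]
print("acceptance rate:", accepted.mean())
print("posterior mean / std:", posterior.mean(), posterior.std())
```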

3.3. Updating of rare event probabilities

In the context of reliability analysis and rare event estimation, the performance of the system under consideration can be described by a limit state function (LSF) $g: \Theta \to \mathbb{R}$. The failure hypersurface defined by $g(\theta) = 0$ splits the space of uncertain variables into the safe domain $\mathcal{S} = \{\theta \in \Theta : g(\theta) > 0\}$ and the failure domain $\mathcal{F} = \{\theta \in \Theta : g(\theta) \le 0\}$. The probability of occurrence of $\mathcal{F} \subseteq \Theta$, referred to as the probability of failure, is defined by

$$\mathbb{P}[\mathcal{F}] = \int_\Theta \mathbb{I}_{\mathcal{F}}[\theta]\, f(\theta)\, d\theta \qquad (12)$$

where $f(\theta)$ is the prior PDF of the model parameters and $\mathbb{I}[\cdot]$ denotes the indicator function, which takes the value $\mathbb{I}_{\mathcal{F}}[\theta] = 1$ when $\theta \in \mathcal{F}$, and $\mathbb{I}_{\mathcal{F}}[\theta] = 0$ otherwise. A special challenge involves the analysis of rare events, that is, when Eq. (12) represents the solution of a potentially high-dimensional integral for which $\mathbb{P}[\mathcal{F}]$ is very small.

The information provided by measured or observed data can be incorporated into the analysis to improve the probability of failure estimate. This implies the computation of failure probabilities conditional on the observations $\tilde{y}$. The updated probability of failure $\mathbb{P}[\mathcal{F} \mid \tilde{y}]$ can be estimated using the posterior PDF of the model parameters as

$$\mathbb{P}[\mathcal{F} \mid \tilde{y}] = \int_\Theta \mathbb{I}_{\mathcal{F}}[\theta]\, f(\theta \mid \tilde{y})\, d\theta = \frac{1}{Z(\tilde{y})} \int_\Theta \mathbb{I}_{\mathcal{F}}[\theta]\, f(\theta)\, L(\theta \mid \tilde{y})\, d\theta. \qquad (13)$$

Advanced simulation methods can be employed for the estimation of the integrals in Eqs. (12) and (13), e.g., sequential importance sampling [5], the cross-entropy method [3], moving particles [5], or subset simulation [48]. However, the estimation of the integral (13) is a more challenging task than (12), since it requires sampling from the tails of the posterior distribution. Several strategies are proposed in the literature to estimate this posterior failure probability, e.g., [53, 54, 55, 56].

In the context of BUS, the reliability updating problem is approached as follows. The posterior distribution of the parameter vector $\theta$ is computed by conditioning the joint distribution of $[\theta, \upsilon]$ on the observation domain $\mathcal{H}$ and marginalizing over $\upsilon$, i.e., $f(\theta \mid \tilde{y}) = c_\theta^{-1} \int_0^1 \mathbb{I}_{\mathcal{H}}[\theta, \upsilon]\, f(\theta)\, d\upsilon$, with the normalizing constant $c_\theta = \int_\Theta \int_0^1 \mathbb{I}_{\mathcal{H}}[\theta, \upsilon]\, f(\theta)\, d\upsilon\, d\theta$. Hence, the posterior failure probability can be expressed in terms of two rare event estimation tasks [9],

$$\mathbb{P}[\mathcal{F} \mid \tilde{y}] = \frac{\int_{\bar{\Theta}} \mathbb{I}_{\mathcal{F}}[\theta]\, \mathbb{I}_{\mathcal{H}}[\theta, \upsilon]\, f(\theta)\, d\upsilon\, d\theta}{\int_{\bar{\Theta}} \mathbb{I}_{\mathcal{H}}[\theta, \upsilon]\, f(\theta)\, d\upsilon\, d\theta} = \frac{\int_{\{g(\theta) \le 0\, \cap\, h(\theta, \upsilon) \le 0\}} f(\theta)\, d\upsilon\, d\theta}{\int_{\{h(\theta, \upsilon) \le 0\}} f(\theta)\, d\upsilon\, d\theta} = \frac{\mathbb{P}[g(\theta) \le 0 \cap h(\theta, \upsilon) \le 0]}{\mathbb{P}[h(\theta, \upsilon) \le 0]}, \qquad (14)$$

which implies the computation of a system reliability problem for the numerator and a component reliability problem for the denominator. In the general case, both reliability estimation tasks in Eq. (14) need to be solved. However, if the Bayesian inverse problem has already been computed with the BUS approach (or any other method), samples from the posterior distribution are available and can be used to accelerate the estimation of the posterior probability of failure $\mathbb{P}[\mathcal{F} \mid \tilde{y}]$. This is because the posterior samples belong to the observation domain $\mathcal{H}$ associated with the LSF $h(\theta, \upsilon)$. Hence, only the estimation of the failure probability corresponding to the numerator is required, i.e., $\mathbb{P}[g(\theta) \le 0 \cap h(\theta, \upsilon) \le 0]$. This represents a conditional reliability problem that can be computed by any advanced rare event estimation algorithm.
In particular, if the SuS method combined with the BUS approach is employed for solving the Bayesian inverse problem, some minor modifications of the original algorithm are required: (i) limit state function of the observation domain: the LSF $h(\theta, \upsilon)$ is fixed at the beginning of the simulation for a suitable constant $c$ (if adaptive BUS is used [8], the posterior solution provides the constant $c$ at no additional cost); (ii) initial Monte Carlo samples: the samples $\theta$ at the first simulation level are the estimated posterior samples; and (iii) acceptance/rejection MCMC criterion: in the MCMC algorithm used within SuS, the candidate sample needs to satisfy the constraint $h(\theta, \upsilon) \le 0$ (in addition to the condition $g(\theta) \le 0$). This guarantees that the proposed samples are not only included in the failure domain $\mathcal{F}$, but also in the observation domain $\mathcal{H}$. Details of this approach are given in [9].
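A crude Monte Carlo version of this idea is sketched below: posterior samples are identified through the BUS acceptance condition, and the ratio in Eq. (14) is estimated by the fraction of those samples that also lie in the failure domain. The scalar likelihood and limit state function are assumed for illustration; for genuinely rare events, subset simulation would replace the crude estimator.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy posterior via the BUS construction (assumed scalar Gaussian likelihood, as above)
y_obs, sigma_eps, c = 0.8, 0.3, 1.0
theta = rng.standard_normal(500_000)                                  # prior samples
u = rng.uniform(size=theta.size)
in_H = u <= c * np.exp(-0.5 * ((y_obs - theta) / sigma_eps) ** 2)     # observation domain H

def g(t):
    """Assumed limit state function: failure when theta exceeds the threshold 2."""
    return 2.0 - t

# Prior failure probability, Eq. (12), and posterior failure probability, Eq. (14)
pf_prior = np.mean(g(theta) <= 0.0)
pf_post = np.mean((g(theta) <= 0.0) & in_H) / np.mean(in_H)
print("P[F] =", pf_prior, "  P[F | y] =", pf_post)
```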

4. Numerical investigations

The focus of this paper is the analysis of the implications of different parameter choices for the prior random field modeling on the solution of the Bayesian inverse problem. We aim at showing those effects by carrying out a parametric study on two examples, one for which it is possible to compute all the posterior quantities analytically, and a second one that requires the use of sampling-based approaches to estimate the posterior quantities.

4.1. 1D cantilever beam: analytical solution

In the following, an example for which it is possible to derive the posterior random field analytically is proposed. This enables a precise evaluation of the influence of the K-L discretization on the posterior solution.

4.1.1. Model description

We consider the second example in [5], the updating of the spatially variable flexibility $F(x)$ of a cantilever beam. The beam has length $L = 5$ m (i.e., the domain is the interval $D = [0, L]$) and is subjected to a deterministic point load $P$ (in kN) at the free end, as shown in Figure 1. The prior flexibility is described by a homogeneous Gaussian random field $F(x, \omega)$. The two Matérn kernels in Eq. (5) are considered as autocovariance functions $C_{FF}(x, x')$ for the flexibility; the prior mean $\mu_F$ and standard deviation $\sigma_F$ of the field are fixed (in units of $(\mathrm{kN\, m^2})^{-1}$). A parameter study on the correlation length $l_c$ is performed.

Figure 1: Cantilever beam: true values and two sets of deflection observations.

From the Euler-Bernoulli equation [57], the bending moment $M(x)$ in the beam can be computed from the differential equation

$$M(x) = E(x)\, I\, \frac{d^2 w(x)}{dx^2} \quad \Longleftrightarrow \quad M(x)\, F(x) = \frac{d^2 w(x)}{dx^2}, \qquad (15)$$

where $w(x)$ is the deflection, $E(x)$ is the elastic modulus, $I$ is the moment of inertia, and $F(x) = (E(x)\, I)^{-1}$ is the flexibility of the beam (the inverse of the bending stiffness).

Integrating Eq. (15) twice and noting that the bending moment of a cantilever beam can be calculated as $M(x) = (L - x)\, P$, the forward deflection response can be obtained by solving the following equation,

$$w(x, F(x)) = P \int_0^x\!\! \int_0^s (L - t)\, F(t)\, dt\, ds. \qquad (16)$$

The observation noise is modeled as additive and mutually independent of the uncertain flexibility. The noise is described by a joint Gaussian PDF with mean zero and covariance matrix $\Sigma_{\eta\eta}$. The noise covariance is computed by assuming that the measurements are correlated with an exponential kernel, with fixed standard deviation $\sigma_\eta$ and correlation length $l_\eta$. This results in the following likelihood function,

$$L(F(x) \mid \tilde{y}) = \frac{1}{\sqrt{(2\pi)^m \det(\Sigma_{\eta\eta})}} \exp\!\left(-\frac{1}{2}\, [\tilde{y} - w(\tilde{x}, F(x))]^{\mathrm{T}}\, \Sigma_{\eta\eta}^{-1}\, [\tilde{y} - w(\tilde{x}, F(x))]\right), \qquad (17)$$

here, $F(x)$ is a realization of the flexibility random field, and $\tilde{y}$ is a set of $m$ deflection observations measured at equally spaced points $\tilde{x}$ of the domain (Figure 1). The observations are generated by simulation assuming a true (but in real applications unknown) deflection of the beam. To avoid a so-called inverse crime [], the underlying true flexibility is generated at a much finer discretization than the one used during the inverse problem solution. Moreover, the full autocovariance information via Cholesky decomposition is used (assuming an exponential kernel with a fixed correlation length, and applying the same noise used in the likelihood).

4.1.2. Analytical solution for prior and posterior

The mean and autocovariance functions of the prior deflection can be evaluated using the prior information about the flexibility $F(x)$ and the forward operator. Since $F(x)$ is Gaussian and $w(x, F(x))$ is a linear function of $F(x)$, the prior distribution of the deflection is also Gaussian. Therefore, an expression for the mean of $w(x)$ can be obtained using $\mu_F$ in Eq. (16),

$$\mu_w(x) = P \int_0^x\!\! \int_0^s (L - t)\, \mu_F\, dt\, ds = \frac{P\, \mu_F}{6}\, x^2 (3L - x) \qquad (18)$$

and similarly, the autocovariance function of $w(x)$ can be deduced using $C_{FF}(x, x')$,

$$C_{ww}(x, x') = P^2 \int_0^x\!\! \int_0^{x'}\!\! \int_0^s\!\! \int_0^{s'} (L - t)(L - t')\, C_{FF}(t, t')\, dt\, dt'\, ds\, ds', \qquad (19)$$

which leads to different expressions depending on the choice of $C_{FF}(x, x')$. The mean, standard deviation and autocorrelation functions of the prior deflection random fields are shown in Figure 2. The autocorrelation functions of the prior flexibility are also plotted (the mean and standard deviation are not included since they are constant).

Closed-form expressions of the posterior random fields of the flexibility and deflection can also be derived in this example. Since the prior and likelihood are Gaussian, the posterior distribution is also Gaussian [47]. We introduce the random vector $\bar{F} = [F, \tilde{y}]$, which is comprised of the random vectors $F = F(x, \omega) \in \mathbb{R}^n$ and $\tilde{y} \in \mathbb{R}^m$, with $F$ representing the random field discretized at the spatial locations $x = [x_1, \ldots, x_n]$. The mean vector and covariance matrix of $\bar{F}$ can be partitioned accordingly in terms of individual and cross components [58]:

$$\mu_{\bar{F}} = \begin{bmatrix} \mu_F \\ \mu_{\tilde{y}} \end{bmatrix} \qquad \Sigma_{\bar{F}\bar{F}} = \begin{bmatrix} \Sigma_{FF} & \Sigma_{F\tilde{y}} \\ \Sigma_{F\tilde{y}}^{\mathrm{T}} & \Sigma_{\tilde{y}\tilde{y}} \end{bmatrix}. \qquad (20)$$

The $n$-th order fi-di posterior distribution of the flexibility random field $f(F \mid \tilde{y})$ can be obtained analytically from direct application of Bayes' theorem (see e.g., [, 3.4]); this conditional PDF is given by

$$f(F \mid \tilde{y}) = \frac{1}{\sqrt{(2\pi)^n \det(\Sigma_{FF \mid \tilde{y}})}} \exp\!\left(-\frac{1}{2}\, [F - \mu_{F \mid \tilde{y}}]^{\mathrm{T}}\, \Sigma_{FF \mid \tilde{y}}^{-1}\, [F - \mu_{F \mid \tilde{y}}]\right). \qquad (21)$$
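The double integral in Eq. (16) and the Gaussian log-likelihood of Eq. (17) are straightforward to evaluate numerically. The sketch below does so with cumulative trapezoidal integration for an arbitrary flexibility realization; the load, grid, noise covariance and flexibility values are assumed for illustration only and are not the values used in the example.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

def deflection(F, x, P, L):
    """Forward model of Eq. (16): w(x) = P * int_0^x int_0^s (L - t) F(t) dt ds."""
    inner = cumulative_trapezoid((L - x) * F, x, initial=0.0)   # int_0^s (L - t) F(t) dt
    return P * cumulative_trapezoid(inner, x, initial=0.0)      # outer integration over s

def log_likelihood(F, x, obs_idx, y_obs, Sigma_eta, P, L):
    """Gaussian log-likelihood of Eq. (17) evaluated at the m observation locations."""
    r = y_obs - deflection(F, x, P, L)[obs_idx]
    _, logdet = np.linalg.slogdet(Sigma_eta)
    return -0.5 * (y_obs.size * np.log(2.0 * np.pi) + logdet
                   + r @ np.linalg.solve(Sigma_eta, r))

# Assumed usage: constant flexibility realization and 10 equally spaced observation points
P, L = 20.0, 5.0                              # load value assumed; length from the example
x = np.linspace(0.0, L, 200)
F = np.full_like(x, 1e-4)                     # flexibility realization (assumed value)
obs_idx = np.linspace(19, 199, 10, dtype=int)
Sigma_eta = 1e-6 * np.eye(obs_idx.size)       # assumed noise covariance
y_obs = deflection(F, x, P, L)[obs_idx]
print(log_likelihood(F, x, obs_idx, y_obs, Sigma_eta, P, L))
```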

Figure 2: Mean, standard deviation, and autocorrelation functions of the prior flexibility and deflection random fields, using the exponential and squared exponential kernels for $C_{FF}$ (with fixed $l_c$).

An analogous expression can be obtained for the posterior distribution of the deflection $f(w \mid \tilde{y})$. Those multivariate distributions are characterized by the conditional mean vectors $\mu_{F \mid \tilde{y}}$, $\mu_{w \mid \tilde{y}}$ and the conditional autocovariance matrices $\Sigma_{FF \mid \tilde{y}}$, $\Sigma_{ww \mid \tilde{y}}$, which are respectively given by [58]

$$\mu_{F \mid \tilde{y}} = \mu_F + \Sigma_{F\tilde{y}}\, \Sigma_{\tilde{y}\tilde{y}}^{-1} (\tilde{y} - \mu_{\tilde{y}}) \qquad \Sigma_{FF \mid \tilde{y}} = \Sigma_{FF} - \Sigma_{F\tilde{y}}\, \Sigma_{\tilde{y}\tilde{y}}^{-1}\, \Sigma_{F\tilde{y}}^{\mathrm{T}} \qquad (22a)$$

$$\mu_{w \mid \tilde{y}} = \mu_w + \Sigma_{w\tilde{y}}\, \Sigma_{\tilde{y}\tilde{y}}^{-1} (\tilde{y} - \mu_{\tilde{y}}) \qquad \Sigma_{ww \mid \tilde{y}} = \Sigma_{ww} - \Sigma_{w\tilde{y}}\, \Sigma_{\tilde{y}\tilde{y}}^{-1}\, \Sigma_{w\tilde{y}}^{\mathrm{T}} \qquad (22b)$$

these quantities are known from the prior random fields or can be computed analytically. The mean, standard deviation and autocorrelation functions of the posterior flexibility and deflection random fields are plotted in Figure 3.

4.1.3. Approximated solution for prior and posterior

When dealing with random fields, the Bayesian inference process involves analysis in high-dimensional spaces. In such cases, the uncertain function is typically represented by a suitable parametrization. The K-L expansion is employed here to discretize the prior flexibility random field. We express the forward operator in Eq. (16) in terms of the K-L expansion of the flexibility field as

$$\hat{w}(x, \theta) = P \int_0^x\!\! \int_0^s (L - t) \left[\mu_F(t) + \sum_{k=1}^{M} \sqrt{\lambda_k}\, \phi_k(t)\, \theta_k\right] dt\, ds \qquad (23a)$$

$$= P \int_0^x\!\! \int_0^s (L - t)\, \mu_F(t)\, dt\, ds + P \int_0^x\!\! \int_0^s (L - t) \sum_{k=1}^{M} \sqrt{\lambda_k}\, \phi_k(t)\, \theta_k\, dt\, ds \qquad (23b)$$

$$= \mu_w(x) + \sum_{k=1}^{M} \Phi_k(x)\, \sqrt{\lambda_k}\, \theta_k \qquad \text{where} \qquad \Phi_k(x) = P \int_0^x\!\! \int_0^s (L - t)\, \phi_k(t)\, dt\, ds. \qquad (23c)$$
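The conditioning formulas of Eq. (22) (and their analogue in Eq. (25) for the K-L coefficients) amount to a single linear solve once the joint Gaussian blocks are assembled. The sketch below illustrates this for a generic linear observation of a discretized Gaussian field; the kernel, observation map and noise level are assumed and serve only to show the mechanics.

```python
import numpy as np

def gaussian_condition(mu_p, Sigma_pp, Sigma_py, Sigma_yy, y_obs, mu_y):
    """Conditioning formulas of Eq. (22): posterior mean and covariance of a Gaussian block."""
    K = np.linalg.solve(Sigma_yy, Sigma_py.T).T      # Sigma_py @ inv(Sigma_yy)
    mu_post = mu_p + K @ (y_obs - mu_y)
    Sigma_post = Sigma_pp - K @ Sigma_py.T
    return mu_post, Sigma_post

# Assumed setting: field discretized at n points, observed through a linear map B plus noise
n, m = 100, 10
x = np.linspace(0.0, 5.0, n)
Sigma_FF = np.exp(-np.abs(x[:, None] - x[None, :]) / 1.0)    # exponential prior covariance
B = np.zeros((m, n))
B[np.arange(m), np.linspace(5, n - 1, m, dtype=int)] = 1.0   # pointwise observations
Sigma_eta = 0.05 * np.eye(m)                                 # assumed noise covariance

mu_F = np.zeros(n)
Sigma_Fy = Sigma_FF @ B.T                                    # cross-covariance field/data
Sigma_yy = B @ Sigma_FF @ B.T + Sigma_eta                    # marginal data covariance
y_obs = 0.5 * np.ones(m)                                     # assumed synthetic observations
mu_post, Sigma_post = gaussian_condition(mu_F, Sigma_FF, Sigma_Fy, Sigma_yy, y_obs, B @ mu_F)
print(mu_post[:5], np.sqrt(np.diag(Sigma_post))[:5])
```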

Figure 3: Mean, standard deviation, and autocorrelation functions of the posterior flexibility (rows 1-2) and deflection (rows 3-4) random fields, from Eqs. (22a) and (22b), using the exponential and squared exponential kernels for $C_{FF}$ (with fixed $l_c$ and $m$).

Alternatively, for a given discretization of the domain $x = [x_1, \ldots, x_n]$, Eq. (23c) can also be written in matrix form as

$$\hat{w} = \mu_w + \Phi \Lambda \theta = \mu_w + A\theta \qquad (24)$$

where $\mu_w \in \mathbb{R}^n$ is the prior mean deflection vector computed from Eq. (18), $\Phi \in \mathbb{R}^{n \times M}$ is a matrix obtained by evaluating $\Phi_k(x)$ in Eq. (23c) (that is, $\Phi_{(j,k)} = \Phi_k(x_j)$, for $j = 1, \ldots, n$), and $\Lambda = \mathrm{diag}(\sqrt{\lambda}) \in \mathbb{R}^{M \times M}$ is a diagonal matrix with the square roots of the eigenvalues. Observe that $\Phi_k(x)$ can be evaluated analytically, given that the eigenpairs of the target autocovariance kernel are available.

The approximated posterior random field can subsequently be computed following the same procedure as in 4.1.2, but in this case for the standard Gaussian random vector $\theta$. Hence, we assume a Gaussian random vector composed of $\theta$ and $\tilde{y}$. The posterior distribution can be calculated as the conditional PDF of $\theta$ given $\tilde{y}$, as in Eq. (21). This posterior random field can be represented as a multivariate Gaussian distribution with conditional mean vector $\mu_{\theta \mid \tilde{y}}$ and conditional covariance matrix $\Sigma_{\theta\theta \mid \tilde{y}}$ given by

$$\mu_{\theta \mid \tilde{y}} = \mu_\theta + \Sigma_{\theta\tilde{y}}\, \Sigma_{\tilde{y}\tilde{y}}^{-1} (\tilde{y} - \mu_{\tilde{y}}) \qquad \text{and} \qquad \Sigma_{\theta\theta \mid \tilde{y}} = \Sigma_{\theta\theta} - \Sigma_{\theta\tilde{y}}\, \Sigma_{\tilde{y}\tilde{y}}^{-1}\, \Sigma_{\theta\tilde{y}}^{\mathrm{T}}; \qquad (25)$$

here, $\mu_\theta = \mathbb{E}[\theta] = 0$, $\Sigma_{\theta\theta} = \mathbf{I}$ (where $\mathbf{I} \in \mathbb{R}^{M \times M}$ is the identity matrix), and the remaining covariance terms can be derived analytically from the approximated model in Eq. (24). Therefore, the mean vector and covariance matrix of the posterior distribution of $\theta$ can be computed respectively as

$$\mu_{\theta \mid \tilde{y}} = A^{\mathrm{T}} \left(A A^{\mathrm{T}} + \Sigma_{\eta\eta}\right)^{-1} (\tilde{y} - \mu_{\tilde{y}}) \qquad \text{and} \qquad \Sigma_{\theta\theta \mid \tilde{y}} = \mathbf{I} - A^{\mathrm{T}} \left(A A^{\mathrm{T}} + \Sigma_{\eta\eta}\right)^{-1} A. \qquad (26)$$

The posterior random fields of the flexibility and deflection after using the K-L approximation can be obtained from the posterior of $\theta$. In this case, both random fields are also represented by multivariate Gaussian distributions, described by the following approximated mean and autocovariance functions,

$$\hat{\mu}_{F \mid \tilde{y}}(x) = \mu_F(x) + \sum_{k=1}^{M} \sqrt{\lambda_k}\, \phi_k(x)\, \mu^{(k)}_{\theta \mid \tilde{y}} \qquad \hat{C}_{FF \mid \tilde{y}}(x, x') = \sum_{k=1}^{M} \sum_{l=1}^{M} \sqrt{\lambda_k \lambda_l}\, \phi_k(x)\, \phi_l(x')\, \Sigma^{(k,l)}_{\theta\theta \mid \tilde{y}} \qquad (27a)$$

$$\hat{\mu}_{w \mid \tilde{y}}(x) = \mu_w(x) + \sum_{k=1}^{M} \sqrt{\lambda_k}\, \Phi_k(x)\, \mu^{(k)}_{\theta \mid \tilde{y}} \qquad \hat{C}_{ww \mid \tilde{y}}(x, x') = \sum_{k=1}^{M} \sum_{l=1}^{M} \sqrt{\lambda_k \lambda_l}\, \Phi_k(x)\, \Phi_l(x')\, \Sigma^{(k,l)}_{\theta\theta \mid \tilde{y}} \qquad (27b)$$

where the superscripts in the vector $\mu^{(k)}_{\theta \mid \tilde{y}}$ and the matrix $\Sigma^{(k,l)}_{\theta\theta \mid \tilde{y}}$ refer to element indexing.

4.1.4. Analytical solution for the model evidence

Consider a finite collection of possible models $\{\mathcal{M}_1, \ldots, \mathcal{M}_M, \ldots, \mathcal{M}_{M_{\max}}\}$, where $M \in [1, M_{\max}]$ is a model indicator index. Each particular model $\mathcal{M}_M$ has an associated vector of uncertain parameters $\theta \in \mathbb{R}^M$, where the dimension $M$ varies between different models. In the context of the K-L discretization, these models correspond to the dimension of the stochastic space discretized by the truncated series, i.e., the number of terms in the K-L expansion. An analytical expression for the model evidence can be derived for this example. The process involves a marginalization of the likelihood function over the parameters (integration); alternatively, it can be computed as the product of prior and likelihood divided by the posterior. Following the latter approach, the natural logarithm of the model evidence is given by

$$\ln Z(\tilde{y} \mid \mathcal{M}) = \ln f(\theta \mid \mathcal{M}) + \ln L(\theta \mid \tilde{y}, \mathcal{M}) - \ln f(\theta \mid \tilde{y}, \mathcal{M}) \qquad (28)$$

where the log-prior, log-likelihood and log-posterior conditional on the dimension are

$$\ln f(\theta \mid \mathcal{M}) = -\frac{M}{2} \ln(2\pi) - \frac{1}{2}\, \theta^{\mathrm{T}} \theta \qquad (29a)$$

$$\ln L(\theta \mid \tilde{y}, \mathcal{M}) = -\frac{m}{2} \ln(2\pi) - \frac{1}{2} \ln\!\left(\det(\Sigma_{\eta\eta})\right) - \frac{1}{2}\, [\tilde{y} - (\mu_w + A\theta)]^{\mathrm{T}}\, \Sigma_{\eta\eta}^{-1}\, [\tilde{y} - (\mu_w + A\theta)] \qquad (29b)$$

$$\ln f(\theta \mid \tilde{y}, \mathcal{M}) = -\frac{M}{2} \ln(2\pi) - \frac{1}{2} \ln\!\left(\det(\Sigma_{\theta\theta \mid \tilde{y}})\right) - \frac{1}{2}\, [\theta - \mu_{\theta \mid \tilde{y}}]^{\mathrm{T}}\, \Sigma_{\theta\theta \mid \tilde{y}}^{-1}\, [\theta - \mu_{\theta \mid \tilde{y}}]. \qquad (29c)$$

After substituting Eqs. (29a)-(29c) into Eq. (28) and some algebra (a derivation is given in the Appendix), the analytical expression for the model evidence is found to be

$$\ln Z(\tilde{y} \mid \mathcal{M}) = -\frac{1}{2} \left( m \ln(2\pi) + \ln\!\left(\frac{\det(\Sigma_{\eta\eta})}{\det(\Sigma_{\theta\theta \mid \tilde{y}})}\right) + (\tilde{y} - \mu_w)^{\mathrm{T}}\, \Sigma_{\eta\eta}^{-1}\, (\tilde{y} - \mu_w) - \mu_{\theta \mid \tilde{y}}^{\mathrm{T}}\, \Sigma_{\theta\theta \mid \tilde{y}}^{-1}\, \mu_{\theta \mid \tilde{y}} \right). \qquad (30)$$

The model evidence is employed to assess whether a more complex model is required for the representation of the measurement data. In this case, the dimension with the highest value of the model evidence is regarded as the best model, meaning that it gives an optimum balance between predictability and quality of the data fit [59].

4.2. Parametric studies

We are now able to evaluate the influence of the K-L expansion on the prior as well as on the posterior flexibility and deflection random fields. The following settings are considered: the number of terms in the K-L expansion is chosen from three values (the smallest being $M = 5$); the correlation length of the prior flexibility is chosen from three values (the largest being $l_c = 4.5$ m); two different sets of measurements with $m$ points each are assumed (see Figure 1); and for each of these settings, the two autocovariance functions in Eq. (5) are used to represent the prior flexibility, namely, the exponential and the squared exponential kernels (the standard deviation is fixed and is specified in 4.1.1).

Posterior approximation: the analytical posterior random field expressions (Eqs. (22a) and (22b)) and the associated K-L approximations (Eqs. (27a) and (27b)) allow us to assess the influence of different prior random field assumptions on the posterior solution. In the following, 95% credible intervals (CI) are represented as the region between the 0.025 and 0.975 quantiles of the posterior. The approximation of the posterior flexibility random field using an exponential kernel as the underlying prior flexibility covariance is illustrated in Figure 4. We show the 95% CI of the analytical solution (shaded area) and the K-L approximations as a function of the number of terms in the expansion, for increasing correlation length. The full set of K-L representations is contained inside the analytical CI, and they converge to this solution as the number of terms increases. Thus, the K-L expansion under-represents the true variability in the posterior flexibility. For small correlation lengths, the posterior random field is more difficult to capture, since one is learning a random field that has larger variability. Nevertheless, a moderate number of terms in the expansion is already enough to obtain a good approximation of the flexibility random field for this example. Comparing the results from both sets of measurements, it can be seen that the number of data points controls the width of the CI bounds. The width of those bounds narrows when more information is available. Furthermore, the flexibility random field is no longer weakly homogeneous, since the posterior mean varies through the domain (the plots are omitted).

Figure 5 presents the approximation of the posterior deflection random field with underlying exponential autocovariance for the prior flexibility. In order to illustrate the distinction between solutions, the 95% CIs of a differential deflection are shown. They are computed as the difference between the prior mean of the random field and the 95% posterior CIs. In contrast to the posterior flexibility, the correlation length does not have a large influence on the K-L approximation of the posterior deflection.
For all cases, the K-L expansion represents the posterior deflection almost exactly, matching the analytical solution even when a small number of terms is used in the expansion. The reason for this is that the posterior deflection is computed by averaging the K-L expansion of the flexibility random field over the domain (see Eq. (23a)). As a result, the influence of the higher K-L eigenfunctions becomes negligible, and mainly the first modes contribute to the random field representation.

Figure 4: Posterior flexibility (using an exponential kernel for the prior): 95% CI for different numbers of terms in the K-L expansion, numbers of measurements (rows), and correlation lengths of the prior flexibility (columns). The shaded area corresponds to the analytical CI (Eq. (22a)).

Figure 5: Differential posterior deflection (using an exponential kernel for the prior): 95% CI for different numbers of terms in the K-L expansion, numbers of measurements (rows), and correlation lengths of the prior flexibility (columns). The shaded area corresponds to the analytical CI (Eq. (22b)).

Finally, the approximation of the posterior flexibility and deflection random fields assuming a squared exponential autocovariance function for the prior flexibility is shown in Figure 6. Here, only the results for the smaller set of measurements are shown. Even for small correlation lengths, K-L expansions with a sufficient number of terms make the difference between the posterior flexibility random field and the analytical solution negligible.

As the correlation length increases, the inverse problem solution can be computed accurately with an even smaller number of terms in the expansion ($M = 5$). We point out that the eigenvalue decay is stronger for the squared exponential kernel as compared to the exponential one, which allows a lower number of terms in the K-L representation. Recall also that the true underlying flexibility is generated assuming an exponential kernel. This is reflected in the inverse problem solution, since sample paths generated from a random field with a squared exponential covariance smooth out faster. The resulting posterior approximation is not able to capture the true underlying field with high confidence at all spatial points of the domain when the correlation length is large.

Figure 6: Posterior flexibility and differential posterior deflection (using a squared exponential kernel for the prior): 95% CI for different numbers of terms in the K-L expansion and correlation lengths of the prior flexibility (columns). The shaded area corresponds to the analytical CI.

Model comparison: the analytical expression of the model evidence in Eq. (30) is now used to perform model comparison. Figure 7 shows the model evidence for different numbers of K-L expansion terms, where the exponential kernel is used as the autocovariance of the prior flexibility. The best models are highlighted by a red solid line. Notice that different choices in the parameters of the prior random field lead to different optimal truncation orders in the K-L expansion. As also evident in the posterior approximation results, random fields described by small correlation lengths require a larger number of terms in the expansion for their discretization. In particular, the information gained by the inclusion of additional terms is negligible once the best dimension is achieved, and is lower than the penalty for the increased model complexity. Furthermore, more measurement data leads to a larger model evidence, which requires more K-L parameters for an optimal random field representation. The model evidence using a squared exponential autocovariance kernel for the prior flexibility is also evaluated (the plots are omitted). The results based on this assumption yield smaller model evidence values as compared to the exponential case. This agrees with the fact that the underlying true autocovariance is of the exponential type.

Since the solution of the inverse problem is typically affected by changes in the data, different measurements will yield different model evidence factors. In order to assess the overall contribution of the number of terms in the K-L discretization, it is relevant to compute the model evidence without considering the measurement data. Therefore, the model evidence can be marginalized with respect to the observational data as

$$\mathbb{E}_{\tilde{y}}\left[Z(\tilde{y} \mid \mathcal{M})\right] = \int_{\tilde{y}} Z(\tilde{y} \mid \mathcal{M})\, f_{\mathrm{data}}(\tilde{y})\, d\tilde{y}; \qquad (31)$$
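As a rough numerical counterpart of the closed-form evidence in Eq. (30), the sketch below evaluates the log-evidence of the linear-Gaussian model $\tilde{y} = \mu_w + A\theta + \eta$ for several truncation orders $M$. The matrix $A$ and all remaining quantities are synthetic stand-ins (not the beam quantities of the example); the decaying column scaling only mimics the decay of the K-L eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(2)

def log_evidence(A, Sigma_eta, mu_w, y):
    """Log-evidence of Eq. (30) for the linear-Gaussian model y = mu_w + A theta + eta."""
    m, M = A.shape
    S = A @ A.T + Sigma_eta                               # marginal data covariance
    r = y - mu_w
    mu_t = A.T @ np.linalg.solve(S, r)                    # posterior mean of theta, Eq. (26)
    Sig_t = np.eye(M) - A.T @ np.linalg.solve(S, A)       # posterior covariance, Eq. (26)
    _, ld_eta = np.linalg.slogdet(Sigma_eta)
    _, ld_t = np.linalg.slogdet(Sig_t)
    quad = r @ np.linalg.solve(Sigma_eta, r) - mu_t @ np.linalg.solve(Sig_t, mu_t)
    return -0.5 * (m * np.log(2.0 * np.pi) + ld_eta - ld_t + quad)

# Synthetic stand-in: data generated with a "true" truncation order of 8
m, M_true = 15, 8
A_full = rng.normal(size=(m, 20)) * 0.8 ** np.arange(20)   # decaying columns mimic sqrt(lambda_k)
Sigma_eta = 0.01 * np.eye(m)
mu_w = np.zeros(m)
y = mu_w + A_full[:, :M_true] @ rng.standard_normal(M_true) \
    + rng.multivariate_normal(np.zeros(m), Sigma_eta)

for M in (2, 5, 8, 12, 20):
    print("M =", M, " ln Z =", log_evidence(A_full[:, :M], Sigma_eta, mu_w, y))
```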


More information

Frequentist-Bayesian Model Comparisons: A Simple Example

Frequentist-Bayesian Model Comparisons: A Simple Example Frequentist-Bayesian Model Comparisons: A Simple Example Consider data that consist of a signal y with additive noise: Data vector (N elements): D = y + n The additive noise n has zero mean and diagonal

More information

c 2016 Society for Industrial and Applied Mathematics

c 2016 Society for Industrial and Applied Mathematics SIAM J. SCI. COMPUT. Vol. 8, No. 5, pp. A779 A85 c 6 Society for Industrial and Applied Mathematics ACCELERATING MARKOV CHAIN MONTE CARLO WITH ACTIVE SUBSPACES PAUL G. CONSTANTINE, CARSON KENT, AND TAN

More information

If we want to analyze experimental or simulated data we might encounter the following tasks:

If we want to analyze experimental or simulated data we might encounter the following tasks: Chapter 1 Introduction If we want to analyze experimental or simulated data we might encounter the following tasks: Characterization of the source of the signal and diagnosis Studying dependencies Prediction

More information

Least Squares Regression

Least Squares Regression E0 70 Machine Learning Lecture 4 Jan 7, 03) Least Squares Regression Lecturer: Shivani Agarwal Disclaimer: These notes are a brief summary of the topics covered in the lecture. They are not a substitute

More information

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix Labor-Supply Shifts and Economic Fluctuations Technical Appendix Yongsung Chang Department of Economics University of Pennsylvania Frank Schorfheide Department of Economics University of Pennsylvania January

More information

STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01

STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 STAT 499/962 Topics in Statistics Bayesian Inference and Decision Theory Jan 2018, Handout 01 Nasser Sadeghkhani a.sadeghkhani@queensu.ca There are two main schools to statistical inference: 1-frequentist

More information

Introduction to Bayesian methods in inverse problems

Introduction to Bayesian methods in inverse problems Introduction to Bayesian methods in inverse problems Ville Kolehmainen 1 1 Department of Applied Physics, University of Eastern Finland, Kuopio, Finland March 4 2013 Manchester, UK. Contents Introduction

More information

Cross entropy-based importance sampling using Gaussian densities revisited

Cross entropy-based importance sampling using Gaussian densities revisited Cross entropy-based importance sampling using Gaussian densities revisited Sebastian Geyer a,, Iason Papaioannou a, Daniel Straub a a Engineering Ris Analysis Group, Technische Universität München, Arcisstraße

More information

Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems

Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems The MIT Faculty has made this article openly available. Please share how this access benefits you. Your

More information

CS 7140: Advanced Machine Learning

CS 7140: Advanced Machine Learning Instructor CS 714: Advanced Machine Learning Lecture 3: Gaussian Processes (17 Jan, 218) Jan-Willem van de Meent (j.vandemeent@northeastern.edu) Scribes Mo Han (han.m@husky.neu.edu) Guillem Reus Muns (reusmuns.g@husky.neu.edu)

More information

Probabilistic Structural Dynamics: Parametric vs. Nonparametric Approach

Probabilistic Structural Dynamics: Parametric vs. Nonparametric Approach Probabilistic Structural Dynamics: Parametric vs. Nonparametric Approach S Adhikari School of Engineering, Swansea University, Swansea, UK Email: S.Adhikari@swansea.ac.uk URL: http://engweb.swan.ac.uk/

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

arxiv: v1 [stat.co] 23 Apr 2018

arxiv: v1 [stat.co] 23 Apr 2018 Bayesian Updating and Uncertainty Quantification using Sequential Tempered MCMC with the Rank-One Modified Metropolis Algorithm Thomas A. Catanach and James L. Beck arxiv:1804.08738v1 [stat.co] 23 Apr

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 7 Approximate

More information

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines

Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Econ 2148, fall 2017 Gaussian process priors, reproducing kernel Hilbert spaces, and Splines Maximilian Kasy Department of Economics, Harvard University 1 / 37 Agenda 6 equivalent representations of the

More information

ECE521 week 3: 23/26 January 2017

ECE521 week 3: 23/26 January 2017 ECE521 week 3: 23/26 January 2017 Outline Probabilistic interpretation of linear regression - Maximum likelihood estimation (MLE) - Maximum a posteriori (MAP) estimation Bias-variance trade-off Linear

More information

Numerical methods for the discretization of random fields by means of the Karhunen Loève expansion

Numerical methods for the discretization of random fields by means of the Karhunen Loève expansion Numerical methods for the discretization of random fields by means of the Karhunen Loève expansion Wolfgang Betz, Iason Papaioannou, Daniel Straub Engineering Risk Analysis Group, Technische Universität

More information

STA414/2104 Statistical Methods for Machine Learning II

STA414/2104 Statistical Methods for Machine Learning II STA414/2104 Statistical Methods for Machine Learning II Murat A. Erdogdu & David Duvenaud Department of Computer Science Department of Statistical Sciences Lecture 3 Slide credits: Russ Salakhutdinov Announcements

More information

CPSC 540: Machine Learning

CPSC 540: Machine Learning CPSC 540: Machine Learning MCMC and Non-Parametric Bayes Mark Schmidt University of British Columbia Winter 2016 Admin I went through project proposals: Some of you got a message on Piazza. No news is

More information

Statistical signal processing

Statistical signal processing Statistical signal processing Short overview of the fundamentals Outline Random variables Random processes Stationarity Ergodicity Spectral analysis Random variable and processes Intuition: A random variable

More information

Gaussian Process Regression

Gaussian Process Regression Gaussian Process Regression 4F1 Pattern Recognition, 21 Carl Edward Rasmussen Department of Engineering, University of Cambridge November 11th - 16th, 21 Rasmussen (Engineering, Cambridge) Gaussian Process

More information

Default Priors and Effcient Posterior Computation in Bayesian

Default Priors and Effcient Posterior Computation in Bayesian Default Priors and Effcient Posterior Computation in Bayesian Factor Analysis January 16, 2010 Presented by Eric Wang, Duke University Background and Motivation A Brief Review of Parameter Expansion Literature

More information

Gaussian Processes for Machine Learning

Gaussian Processes for Machine Learning Gaussian Processes for Machine Learning Carl Edward Rasmussen Max Planck Institute for Biological Cybernetics Tübingen, Germany carl@tuebingen.mpg.de Carlos III, Madrid, May 2006 The actual science of

More information

STA 294: Stochastic Processes & Bayesian Nonparametrics

STA 294: Stochastic Processes & Bayesian Nonparametrics MARKOV CHAINS AND CONVERGENCE CONCEPTS Markov chains are among the simplest stochastic processes, just one step beyond iid sequences of random variables. Traditionally they ve been used in modelling a

More information

Introduction to Machine Learning

Introduction to Machine Learning Introduction to Machine Learning Brown University CSCI 1950-F, Spring 2012 Prof. Erik Sudderth Lecture 25: Markov Chain Monte Carlo (MCMC) Course Review and Advanced Topics Many figures courtesy Kevin

More information

STA 4273H: Statistical Machine Learning

STA 4273H: Statistical Machine Learning STA 4273H: Statistical Machine Learning Russ Salakhutdinov Department of Statistics! rsalakhu@utstat.toronto.edu! http://www.utstat.utoronto.ca/~rsalakhu/ Sidney Smith Hall, Room 6002 Lecture 3 Linear

More information

Multivariate Distribution Models

Multivariate Distribution Models Multivariate Distribution Models Model Description While the probability distribution for an individual random variable is called marginal, the probability distribution for multiple random variables is

More information

A short introduction to INLA and R-INLA

A short introduction to INLA and R-INLA A short introduction to INLA and R-INLA Integrated Nested Laplace Approximation Thomas Opitz, BioSP, INRA Avignon Workshop: Theory and practice of INLA and SPDE November 7, 2018 2/21 Plan for this talk

More information

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of

covariance function, 174 probability structure of; Yule-Walker equations, 174 Moving average process, fluctuations, 5-6, 175 probability structure of Index* The Statistical Analysis of Time Series by T. W. Anderson Copyright 1971 John Wiley & Sons, Inc. Aliasing, 387-388 Autoregressive {continued) Amplitude, 4, 94 case of first-order, 174 Associated

More information

Point spread function reconstruction from the image of a sharp edge

Point spread function reconstruction from the image of a sharp edge DOE/NV/5946--49 Point spread function reconstruction from the image of a sharp edge John Bardsley, Kevin Joyce, Aaron Luttman The University of Montana National Security Technologies LLC Montana Uncertainty

More information

Review (Probability & Linear Algebra)

Review (Probability & Linear Algebra) Review (Probability & Linear Algebra) CE-725 : Statistical Pattern Recognition Sharif University of Technology Spring 2013 M. Soleymani Outline Axioms of probability theory Conditional probability, Joint

More information

Risk and Reliability Analysis: Theory and Applications: In Honor of Prof. Armen Der Kiureghian. Edited by P.

Risk and Reliability Analysis: Theory and Applications: In Honor of Prof. Armen Der Kiureghian. Edited by P. Appeared in: Risk and Reliability Analysis: Theory and Applications: In Honor of Prof. Armen Der Kiureghian. Edited by P. Gardoni, Springer, 2017 Reliability updating in the presence of spatial variability

More information

Polynomial chaos expansions for structural reliability analysis

Polynomial chaos expansions for structural reliability analysis DEPARTMENT OF CIVIL, ENVIRONMENTAL AND GEOMATIC ENGINEERING CHAIR OF RISK, SAFETY & UNCERTAINTY QUANTIFICATION Polynomial chaos expansions for structural reliability analysis B. Sudret & S. Marelli Incl.

More information

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods

Pattern Recognition and Machine Learning. Bishop Chapter 11: Sampling Methods Pattern Recognition and Machine Learning Chapter 11: Sampling Methods Elise Arnaud Jakob Verbeek May 22, 2008 Outline of the chapter 11.1 Basic Sampling Algorithms 11.2 Markov Chain Monte Carlo 11.3 Gibbs

More information

Polynomial chaos expansions for sensitivity analysis

Polynomial chaos expansions for sensitivity analysis c DEPARTMENT OF CIVIL, ENVIRONMENTAL AND GEOMATIC ENGINEERING CHAIR OF RISK, SAFETY & UNCERTAINTY QUANTIFICATION Polynomial chaos expansions for sensitivity analysis B. Sudret Chair of Risk, Safety & Uncertainty

More information

F denotes cumulative density. denotes probability density function; (.)

F denotes cumulative density. denotes probability density function; (.) BAYESIAN ANALYSIS: FOREWORDS Notation. System means the real thing and a model is an assumed mathematical form for the system.. he probability model class M contains the set of the all admissible models

More information

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA

BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA BAYESIAN METHODS FOR VARIABLE SELECTION WITH APPLICATIONS TO HIGH-DIMENSIONAL DATA Intro: Course Outline and Brief Intro to Marina Vannucci Rice University, USA PASI-CIMAT 04/28-30/2010 Marina Vannucci

More information

Infinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix

Infinite-State Markov-switching for Dynamic. Volatility Models : Web Appendix Infinite-State Markov-switching for Dynamic Volatility Models : Web Appendix Arnaud Dufays 1 Centre de Recherche en Economie et Statistique March 19, 2014 1 Comparison of the two MS-GARCH approximations

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

Robust MCMC Sampling with Non-Gaussian and Hierarchical Priors

Robust MCMC Sampling with Non-Gaussian and Hierarchical Priors Division of Engineering & Applied Science Robust MCMC Sampling with Non-Gaussian and Hierarchical Priors IPAM, UCLA, November 14, 2017 Matt Dunlop Victor Chen (Caltech) Omiros Papaspiliopoulos (ICREA,

More information

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics)

Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Brandon C. Kelly (Harvard Smithsonian Center for Astrophysics) Probability quantifies randomness and uncertainty How do I estimate the normalization and logarithmic slope of a X ray continuum, assuming

More information

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop

These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop Music and Machine Learning (IFT68 Winter 8) Prof. Douglas Eck, Université de Montréal These slides follow closely the (English) course textbook Pattern Recognition and Machine Learning by Christopher Bishop

More information

Least Squares Regression

Least Squares Regression CIS 50: Machine Learning Spring 08: Lecture 4 Least Squares Regression Lecturer: Shivani Agarwal Disclaimer: These notes are designed to be a supplement to the lecture. They may or may not cover all the

More information

Variational Principal Components

Variational Principal Components Variational Principal Components Christopher M. Bishop Microsoft Research 7 J. J. Thomson Avenue, Cambridge, CB3 0FB, U.K. cmbishop@microsoft.com http://research.microsoft.com/ cmbishop In Proceedings

More information

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Inference. Chapter 4: Regression and Hierarchical Models Bayesian Inference Chapter 4: Regression and Hierarchical Models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Advanced Statistics and Data Mining Summer School

More information

ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering

ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering ECE276A: Sensing & Estimation in Robotics Lecture 10: Gaussian Mixture and Particle Filtering Lecturer: Nikolay Atanasov: natanasov@ucsd.edu Teaching Assistants: Siwei Guo: s9guo@eng.ucsd.edu Anwesan Pal:

More information

LECTURE 15 Markov chain Monte Carlo

LECTURE 15 Markov chain Monte Carlo LECTURE 15 Markov chain Monte Carlo There are many settings when posterior computation is a challenge in that one does not have a closed form expression for the posterior distribution. Markov chain Monte

More information

CSC 2541: Bayesian Methods for Machine Learning

CSC 2541: Bayesian Methods for Machine Learning CSC 2541: Bayesian Methods for Machine Learning Radford M. Neal, University of Toronto, 2011 Lecture 10 Alternatives to Monte Carlo Computation Since about 1990, Markov chain Monte Carlo has been the dominant

More information

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012

Gaussian Processes. Le Song. Machine Learning II: Advanced Topics CSE 8803ML, Spring 2012 Gaussian Processes Le Song Machine Learning II: Advanced Topics CSE 8803ML, Spring 01 Pictorial view of embedding distribution Transform the entire distribution to expected features Feature space Feature

More information

Lecture 3: More on regularization. Bayesian vs maximum likelihood learning

Lecture 3: More on regularization. Bayesian vs maximum likelihood learning Lecture 3: More on regularization. Bayesian vs maximum likelihood learning L2 and L1 regularization for linear estimators A Bayesian interpretation of regularization Bayesian vs maximum likelihood fitting

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

However, reliability analysis is not limited to calculation of the probability of failure.

However, reliability analysis is not limited to calculation of the probability of failure. Probabilistic Analysis probabilistic analysis methods, including the first and second-order reliability methods, Monte Carlo simulation, Importance sampling, Latin Hypercube sampling, and stochastic expansions

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters

More information

Bayesian Regression Linear and Logistic Regression

Bayesian Regression Linear and Logistic Regression When we want more than point estimates Bayesian Regression Linear and Logistic Regression Nicole Beckage Ordinary Least Squares Regression and Lasso Regression return only point estimates But what if we

More information

Bayes Model Selection with Path Sampling: Factor Models

Bayes Model Selection with Path Sampling: Factor Models with Path Sampling: Factor Models Ritabrata Dutta and Jayanta K Ghosh Purdue University 07/02/11 Factor Models in Applications Factor Models in Applications Factor Models Factor Models and Factor analysis

More information

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline.

Dependence. Practitioner Course: Portfolio Optimization. John Dodson. September 10, Dependence. John Dodson. Outline. Practitioner Course: Portfolio Optimization September 10, 2008 Before we define dependence, it is useful to define Random variables X and Y are independent iff For all x, y. In particular, F (X,Y ) (x,

More information

Development of Stochastic Artificial Neural Networks for Hydrological Prediction

Development of Stochastic Artificial Neural Networks for Hydrological Prediction Development of Stochastic Artificial Neural Networks for Hydrological Prediction G. B. Kingston, M. F. Lambert and H. R. Maier Centre for Applied Modelling in Water Engineering, School of Civil and Environmental

More information

Sequential Importance Sampling for Rare Event Estimation with Computer Experiments

Sequential Importance Sampling for Rare Event Estimation with Computer Experiments Sequential Importance Sampling for Rare Event Estimation with Computer Experiments Brian Williams and Rick Picard LA-UR-12-22467 Statistical Sciences Group, Los Alamos National Laboratory Abstract Importance

More information

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling

27 : Distributed Monte Carlo Markov Chain. 1 Recap of MCMC and Naive Parallel Gibbs Sampling 10-708: Probabilistic Graphical Models 10-708, Spring 2014 27 : Distributed Monte Carlo Markov Chain Lecturer: Eric P. Xing Scribes: Pengtao Xie, Khoa Luu In this scribe, we are going to review the Parallel

More information

Variational Methods in Bayesian Deconvolution

Variational Methods in Bayesian Deconvolution PHYSTAT, SLAC, Stanford, California, September 8-, Variational Methods in Bayesian Deconvolution K. Zarb Adami Cavendish Laboratory, University of Cambridge, UK This paper gives an introduction to the

More information

On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

More information

conditional cdf, conditional pdf, total probability theorem?

conditional cdf, conditional pdf, total probability theorem? 6 Multiple Random Variables 6.0 INTRODUCTION scalar vs. random variable cdf, pdf transformation of a random variable conditional cdf, conditional pdf, total probability theorem expectation of a random

More information

STA 4273H: Sta-s-cal Machine Learning

STA 4273H: Sta-s-cal Machine Learning STA 4273H: Sta-s-cal Machine Learning Russ Salakhutdinov Department of Computer Science! Department of Statistical Sciences! rsalakhu@cs.toronto.edu! h0p://www.cs.utoronto.ca/~rsalakhu/ Lecture 2 In our

More information

The connection of dropout and Bayesian statistics

The connection of dropout and Bayesian statistics The connection of dropout and Bayesian statistics Interpretation of dropout as approximate Bayesian modelling of NN http://mlg.eng.cam.ac.uk/yarin/thesis/thesis.pdf Dropout Geoffrey Hinton Google, University

More information

Introduction to Probability and Statistics (Continued)

Introduction to Probability and Statistics (Continued) Introduction to Probability and Statistics (Continued) Prof. icholas Zabaras Center for Informatics and Computational Science https://cics.nd.edu/ University of otre Dame otre Dame, Indiana, USA Email:

More information

Dynamic response of structures with uncertain properties

Dynamic response of structures with uncertain properties Dynamic response of structures with uncertain properties S. Adhikari 1 1 Chair of Aerospace Engineering, College of Engineering, Swansea University, Bay Campus, Fabian Way, Swansea, SA1 8EN, UK International

More information

Kernel-based Approximation. Methods using MATLAB. Gregory Fasshauer. Interdisciplinary Mathematical Sciences. Michael McCourt.

Kernel-based Approximation. Methods using MATLAB. Gregory Fasshauer. Interdisciplinary Mathematical Sciences. Michael McCourt. SINGAPORE SHANGHAI Vol TAIPEI - Interdisciplinary Mathematical Sciences 19 Kernel-based Approximation Methods using MATLAB Gregory Fasshauer Illinois Institute of Technology, USA Michael McCourt University

More information

Linear Algebra in Computer Vision. Lecture2: Basic Linear Algebra & Probability. Vector. Vector Operations

Linear Algebra in Computer Vision. Lecture2: Basic Linear Algebra & Probability. Vector. Vector Operations Linear Algebra in Computer Vision CSED441:Introduction to Computer Vision (2017F Lecture2: Basic Linear Algebra & Probability Bohyung Han CSE, POSTECH bhhan@postech.ac.kr Mathematics in vector space Linear

More information

Consistent Downscaling of Seismic Inversions to Cornerpoint Flow Models SPE

Consistent Downscaling of Seismic Inversions to Cornerpoint Flow Models SPE Consistent Downscaling of Seismic Inversions to Cornerpoint Flow Models SPE 103268 Subhash Kalla LSU Christopher D. White LSU James S. Gunning CSIRO Michael E. Glinsky BHP-Billiton Contents Method overview

More information

Bayesian Inference. Chapter 4: Regression and Hierarchical Models

Bayesian Inference. Chapter 4: Regression and Hierarchical Models Bayesian Inference Chapter 4: Regression and Hierarchical Models Conchi Ausín and Mike Wiper Department of Statistics Universidad Carlos III de Madrid Master in Business Administration and Quantitative

More information

Organization. I MCMC discussion. I project talks. I Lecture.

Organization. I MCMC discussion. I project talks. I Lecture. Organization I MCMC discussion I project talks. I Lecture. Content I Uncertainty Propagation Overview I Forward-Backward with an Ensemble I Model Reduction (Intro) Uncertainty Propagation in Causal Systems

More information

1 Bayesian Linear Regression (BLR)

1 Bayesian Linear Regression (BLR) Statistical Techniques in Robotics (STR, S15) Lecture#10 (Wednesday, February 11) Lecturer: Byron Boots Gaussian Properties, Bayesian Linear Regression 1 Bayesian Linear Regression (BLR) In linear regression,

More information

A framework for global reliability sensitivity analysis in the presence of multi-uncertainty

A framework for global reliability sensitivity analysis in the presence of multi-uncertainty A framework for global reliability sensitivity analysis in the presence of multi-uncertainty Max Ehre, Iason Papaioannou, Daniel Straub Engineering Risk Analysis Group, Technische Universität München,

More information

MULTISCALE FINITE ELEMENT METHODS FOR STOCHASTIC POROUS MEDIA FLOW EQUATIONS AND APPLICATION TO UNCERTAINTY QUANTIFICATION

MULTISCALE FINITE ELEMENT METHODS FOR STOCHASTIC POROUS MEDIA FLOW EQUATIONS AND APPLICATION TO UNCERTAINTY QUANTIFICATION MULTISCALE FINITE ELEMENT METHODS FOR STOCHASTIC POROUS MEDIA FLOW EQUATIONS AND APPLICATION TO UNCERTAINTY QUANTIFICATION P. DOSTERT, Y. EFENDIEV, AND T.Y. HOU Abstract. In this paper, we study multiscale

More information

Part 1: Expectation Propagation

Part 1: Expectation Propagation Chalmers Machine Learning Summer School Approximate message passing and biomedicine Part 1: Expectation Propagation Tom Heskes Machine Learning Group, Institute for Computing and Information Sciences Radboud

More information

PRECONDITIONING MARKOV CHAIN MONTE CARLO SIMULATIONS USING COARSE-SCALE MODELS

PRECONDITIONING MARKOV CHAIN MONTE CARLO SIMULATIONS USING COARSE-SCALE MODELS PRECONDITIONING MARKOV CHAIN MONTE CARLO SIMULATIONS USING COARSE-SCALE MODELS Y. EFENDIEV, T. HOU, AND W. LUO Abstract. We study the preconditioning of Markov Chain Monte Carlo (MCMC) methods using coarse-scale

More information

Bayesian Inference and MCMC

Bayesian Inference and MCMC Bayesian Inference and MCMC Aryan Arbabi Partly based on MCMC slides from CSC412 Fall 2018 1 / 18 Bayesian Inference - Motivation Consider we have a data set D = {x 1,..., x n }. E.g each x i can be the

More information