Comparison study between MCMC-based and weight-based Bayesian methods for identification of joint distribution


Struct Multidisc Optim (2010) 42: DOI /s
RESEARCH PAPER

Yoojeong Noh · K. K. Choi · Ikjin Lee

Received: 24 January 2010 / Revised: 16 June 2010 / Accepted: 30 June 2010 / Published online: 27 July 2010
© Springer-Verlag 2010

Abstract The Bayesian method is widely used to identify a joint distribution, which is modeled by marginal distributions and a copula. The joint distribution can be identified by a one-step procedure, which directly tests all candidate joint distributions, or by a two-step procedure, which first identifies the marginal distributions and then the copula. The weight-based Bayesian method using the two-step procedure and the Markov chain Monte Carlo (MCMC)-based Bayesian method using the one-step and two-step procedures were recently developed. In this paper, the one-step weight-based Bayesian method and the two-step MCMC-based Bayesian method using parametric marginal distributions are proposed. Comparison studies among the Bayesian methods have not been thoroughly carried out. In this paper, the weight-based and MCMC-based Bayesian methods using one-step and two-step procedures are compared through simulation studies to see which method identifies the correct joint distribution most accurately and efficiently. The simulation results indicate that the two-step weight-based Bayesian method has the best performance.

Keywords Copula · Identification of joint distribution · Weight-based Bayesian method · MCMC-based Bayesian method · One-step procedure · Two-step procedure

Y. Noh · K. K. Choi (corresponding author) · I. Lee
Department of Mechanical & Industrial Engineering, College of Engineering, The University of Iowa, Iowa City, IA 52242, USA
K. K. Choi: kkchoi@engineering.uiowa.edu
Y. Noh: noh@engineering.uiowa.edu
I. Lee: ilee@engineering.uiowa.edu

1 Introduction

In many engineering applications, input random variables such as fatigue material properties have been found to be correlated (Socie 2003; Annis 2004; Efstratios et al. 2004; Pham 2006). When the input random variables are correlated, their joint distribution needs to be obtained. For example, reliability-based design optimization (RBDO) requires an accurate joint distribution of the correlated input variables to obtain an accurate optimum design (Noh et al. 2007, 2008). However, it is difficult to model the joint distribution from limited data in real engineering applications. For this purpose, a copula can be used to generate a joint distribution from the correlation parameter and the marginal distributions, which can be obtained from experimental data.

To identify the correct joint distribution, the Bayesian method or a goodness-of-fit (GOF) test can be used. Since the Bayesian method is known to be more efficient and accurate than the GOF test in identifying the copula and marginal distributions (Noh et al. 2010), only the Bayesian method is used in this paper.

The joint distribution can be obtained by a one-step or a two-step procedure. The one-step Bayesian method identifies a joint distribution by directly testing all candidate joint distributions, while the two-step Bayesian method identifies the marginal distributions first and then identifies a copula using the identified marginal distributions, which are used to construct the joint distribution (Huard et al. 2006; Genest et al. 1995; Hürliman 2004; Roch and Alegre 2006).

The weight-based Bayesian method using the two-step procedure (Noh et al. 2010) and the Markov chain Monte Carlo (MCMC)-based Bayesian method using the one-step and two-step procedures (Silva and Lopes 2008) were recently developed. Simulation test results showed that those Bayesian methods perform well in
identifying a correct joint distribution. However, the weight-based Bayesian method using the one-step procedure was not investigated. The MCMC-based method using the two-step procedure was developed (Silva and Lopes 2008), but only the empirical marginal distribution, which is not often used in engineering applications, was considered. Thus, the one-step weight-based Bayesian method and the two-step MCMC-based Bayesian method using parametric marginal distributions are developed in this paper.

Even though the Bayesian methods have been investigated in many studies, it has not been thoroughly tested which of them performs best in identifying the joint distribution. Through simulation tests, the weight-based and MCMC-based Bayesian methods using one-step and two-step procedures are compared in this paper to see how accurately and efficiently they identify a correct joint distribution for various types of joint distributions.

In Section 2, the basic concept of the copula is introduced. The weight-based and MCMC-based Bayesian methods using one-step and two-step procedures are illustrated in Sections 3 and 4, respectively. Section 5 shows the simulation results of the comparison studies in terms of accuracy and efficiency of identifying a joint distribution.

2 Copula to represent joint distribution

Consider a joint cumulative distribution function (CDF) $F_{X_1 \cdots X_n}(x_1, \ldots, x_n)$ of random variables $X_i$ for $i = 1, \ldots, n$. According to Sklar's theorem (Nelsen 1999), there exists a copula $C$, unique when the marginal CDFs are continuous, such that

$$F_{X_1 \cdots X_n}(x_1, \ldots, x_n) = C\left(F_{X_1}(x_1), \ldots, F_{X_n}(x_n) \mid \theta\right) \tag{1}$$

where $F_{X_i}(x_i)$ is the marginal CDF of $X_i$ for $i = 1, \ldots, n$, and $\theta$ is the matrix of correlation parameters between $X_1, \ldots, X_n$.

Taking the derivative of (1) with respect to $x_1, \ldots, x_n$, the joint probability density function (PDF) is obtained as

$$f_{X_1 \cdots X_n}(x_1, \ldots, x_n) = c\left(F_{X_1}(x_1), \ldots, F_{X_n}(x_n) \mid \theta\right) \prod_{i=1}^{n} f_{X_i}(x_i) \tag{2}$$

where $c(u_1, \ldots, u_n) = \dfrac{\partial^n C(u_1, \ldots, u_n)}{\partial u_1 \cdots \partial u_n}$ is the copula density function with $u_i = F_{X_i}(x_i)$, and $f_{X_i}(x_i)$ is the marginal PDF of $X_i$ for $i = 1, \ldots, n$. Thus, the joint CDF and PDF can be constructed by combining the marginal distributions and a copula function.

Most copula applications consider bivariate data because few copula families have an n-dimensional generalization. Some copula families, such as the Archimedean family, can represent the joint distribution of n correlated variables, but they have only one correlation coefficient for all n variables. In many cases it has been observed that the correlation is between two input variables (Socie 2003; Annis 2004; Efstratios et al. 2004; Pham 2006), so only bivariate copulas are considered in this paper.

To model the joint CDF using a bivariate copula, the correlation parameter $\theta$ needs to be obtained from experimental data. Since each type of copula has its own correlation parameter, it is desirable to have a common correlation measure from which the correlation parameter can be obtained. There are two commonly used correlation coefficients, Pearson's rho and Kendall's tau. Pearson's rho (Pearson 1896) measures the linear dependence between two variables. If the two variables have a nonlinear dependence, Pearson's rho cannot accurately measure their correlation. Kendall's tau, on the other hand, measures the correspondence of rankings between random variables (Kendall 1938; Kruskal 1958), which is theoretically related to the definition of copulas, so it can be used for various copulas with both linear and nonlinear dependence. Thus, in this paper, Kendall's tau is used.
The population version of Kendall's tau ($\tau$) can be obtained using the copula function and the correlation parameter $\theta$ as

$$\tau = 4 \iint_{I^2} C(u, v \mid \theta)\, dC(u, v) - 1 \tag{3}$$

where $I^2 = I \times I$ ($I = [0, 1]$), $dC(u, v) = \dfrac{\partial^2 C}{\partial u\, \partial v}\, du\, dv$, and $u = F_X(x)$ and $v = F_Y(y)$ are the marginal CDFs of $X$ and $Y$, respectively (Nelsen 1999). The sample version of Kendall's tau ($t$) is obtained as

$$t = \frac{c - d}{c + d} \tag{4}$$

where $c$ and $d$ represent the number of concordant and discordant pairs in the given data, respectively. Using the estimated Kendall's tau, the correlation parameter of the copula, $\theta$, can be calculated because Kendall's tau can be expressed as a function of the correlation parameter, as shown in (3). The explicit functions of (3) for some copulas that are used in this paper are presented in Noh et al. (2010). More theoretical explanations of copulas are presented in Nelsen (1999) and Joe (1997).

3 Weight-based Bayesian method

The weight-based Bayesian method calculates the normalized weights of candidate models such as marginal distributions, copulas, or joint distributions using a probability
of a hypothesis on each candidate model, and selects the model with the highest normalized weight as the correct one. Sections 3.1 and 3.2 illustrate the one-step and two-step weight-based Bayesian methods, respectively.

3.1 One-step weight-based Bayesian method

Consider a hypothesis $h_{ijk}$ that the given data $D$ come from a candidate joint distribution $M_{ijk}$, where $i$ and $j$ index the seven candidate marginal distributions of $X$ and $Y$ ($i = 1, \ldots, 7$ and $j = 1, \ldots, 7$) and $k$ indexes the nine candidate copulas ($k = 1, \ldots, 9$). That is, $M_{ijk}$ is the candidate joint distribution modeled by the ith and jth candidate marginal distributions and the kth candidate copula. In this paper, the candidate marginal distributions for the two random variables $X$ and $Y$ are Gaussian, Weibull, Gamma, Lognormal, Gumbel, Extreme type-I, and Extreme type-II, whereas the candidate copulas are the Clayton, AMH, Gumbel, Frank, A12, A14, FGM, Gaussian, and independent copulas. Thus, there are 441 (7 × 7 × 9) candidate joint distributions to be tested.

To identify the joint distribution that best describes the given data among the candidates, the Bayesian method considers the probability of each hypothesis $h_{ijk}$ given data $D$ as (Noh et al. 2010; Huard et al. 2006)

$$\Pr(h_{ijk} \mid D, I) = \frac{\Pr(D \mid h_{ijk}, I)\, \Pr(h_{ijk} \mid I)}{\Pr(D \mid I)} \tag{5}$$

where $\Pr(D \mid h_{ijk}, I)$ is the conditional probability of drawing data $D$ under the hypothesis $h_{ijk}$, $\Pr(h_{ijk} \mid I)$ is the prior probability of the candidate, $\Pr(D \mid I)$ is the normalization constant, and $I$ denotes any relevant additional knowledge.

Consider a parameter vector $\Theta = [\mu_X, \sigma_X, \mu_Y, \sigma_Y, \theta]^T$ consisting of the means and standard deviations of $X$ and $Y$ and the correlation parameter between $X$ and $Y$. Assume that the standard deviations of $X$ and $Y$ are fixed values, which are obtained from the given data.
Selecting $\mu_X$, $\mu_Y$, and $\theta$ as the nuisance variables, (5) can be written as

$$\Pr(h_{ijk} \mid D, I) = \frac{1}{\Pr(D \mid I)} \iiint \Pr(D \mid h_{ijk}, \Theta, I)\, \Pr(h_{ijk} \mid \Theta, I)\, \Pr(\Theta \mid I)\, d\mu_X\, d\mu_Y\, d\tau \tag{6}$$

Since each copula has its own correlation parameter $\theta$, Kendall's tau $\tau$, which is related to it by $\theta = r_k^{-1}(\tau)$, is used as the nuisance variable for the kth copula in (6). The explicit equations for some copulas are presented in Noh et al. (2010).

Equation (6) could be expressed in terms of five parameters, $\mu_X$, $\mu_Y$, $\sigma_X$, $\sigma_Y$, and $\tau$. However, the five-dimensional integration requires significantly more computational effort, and its performance is similar to that of the triple integration over the two means $\mu_X$, $\mu_Y$ and the correlation coefficient $\tau$. The triple integration is therefore used in this paper to calculate the probability of the hypothesis on each candidate, as shown in (6). In (6), the two standard deviations $\sigma_X$ and $\sigma_Y$ could be used as nuisance variables for the triple integration instead of $\mu_X$ and $\mu_Y$. However, using the means identifies the correct distribution better than using the standard deviations (Noh et al. 2010).

$\Pr(D \mid h_{ijk}, \Theta, I)$ is the likelihood function of the parameter vector $\Theta$ for data $D$ under the hypothesis $h_{ijk}$, which is expressed as

$$\Pr(D \mid h_{ijk}, \Theta, I) = L(\Theta \mid \mathbf{x}, \mathbf{y}, M_{ijk}) = \prod_{m=1}^{ns} f_{XY}^{ijk}(x_m, y_m \mid \Theta) \tag{7}$$

for given paired data $\mathbf{x} = [x_1, \ldots, x_{ns}]^T$ and $\mathbf{y} = [y_1, \ldots, y_{ns}]^T$, where $ns$ is the number of paired data. In (7), $f_{XY}^{ijk}(\cdot)$ is the candidate joint PDF of $X$ and $Y$, and $x_m$ and $y_m$ represent the mth sample point for $m = 1, \ldots, ns$. Using the copula density function, the joint PDF for $M_{ijk}$ can be written as

$$f_{XY}^{ijk}(x_m, y_m \mid \Theta) = c_k\left(F_X^i(x_m \mid \mu_X), F_Y^j(y_m \mid \mu_Y) \mid r_k^{-1}(\tau)\right) f_X^i(x_m \mid \mu_X)\, f_Y^j(y_m \mid \mu_Y) \tag{8}$$

for the ith and jth candidate marginal distributions and the kth candidate copula.
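As a rough illustration of (4), (7), and (8), the sketch below evaluates the joint log-likelihood of one candidate, chosen here as Gaussian marginals with a Clayton copula. The function names are hypothetical and not from the paper. The Clayton relation $\theta = 2\tau/(1-\tau)$ and the Clayton density are the standard closed forms, assumed here to play the role of $r_k^{-1}(\tau)$ and $c_k$ for this copula. The standard deviations are held fixed, as in the text, and the logarithm of (7) is used to avoid numerical underflow.

```python
import math

def kendall_tau(x, y):
    """Sample Kendall's tau: (concordant - discordant) / (concordant + discordant)."""
    c = d = 0
    for a in range(len(x)):
        for b in range(a + 1, len(x)):
            s = (x[a] - x[b]) * (y[a] - y[b])
            if s > 0:
                c += 1
            elif s < 0:
                d += 1
    return (c - d) / (c + d)

def clayton_theta_from_tau(tau):
    """Standard Clayton relation tau = theta / (theta + 2), solved for theta (assumed form)."""
    return 2.0 * tau / (1.0 - tau)

def clayton_density(u, v, theta):
    """Clayton copula density c(u, v | theta) for theta > 0."""
    return ((1.0 + theta) * (u * v) ** (-1.0 - theta)
            * (u ** -theta + v ** -theta - 1.0) ** (-2.0 - 1.0 / theta))

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def joint_log_likelihood(x, y, mu_x, mu_y, sigma_x, sigma_y, tau):
    """Log of the product in (7), with the joint PDF factored as in (8)."""
    theta = clayton_theta_from_tau(tau)
    total = 0.0
    for xm, ym in zip(x, y):
        u = normal_cdf(xm, mu_x, sigma_x)
        v = normal_cdf(ym, mu_y, sigma_y)
        total += math.log(clayton_density(u, v, theta))
        total += math.log(normal_pdf(xm, mu_x, sigma_x))
        total += math.log(normal_pdf(ym, mu_y, sigma_y))
    return total

# Illustrative (made-up) paired data
x = [2.1, 3.4, 4.0, 5.2, 6.8, 7.5]
y = [1.9, 3.9, 3.5, 5.0, 7.1, 6.6]
t = kendall_tau(x, y)
log_l = joint_log_likelihood(x, y, 5.0, 5.0, 2.5, 2.5, t)
```

Other candidates would swap in their own marginal PDF/CDF pair and copula density; the structure of the product stays the same.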
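In (6), since all candidates are equally probable with respect to $\Theta$, $\Pr(h_{ijk} \mid \Theta, I)$ is obtained as

$$\Pr(h_{ijk} \mid \Theta, I) = \begin{cases} 1, & \Theta \in \Omega_{ijk} \\ 0, & \Theta \notin \Omega_{ijk} \end{cases} \tag{9}$$

where $\Omega_{ijk}$ is the domain of the parameter vector of the candidate $M_{ijk}$. In (6), $\Pr(\Theta \mid I)$ is the prior distribution on $\Theta$,

$$\Pr(\Theta \mid I) = \begin{cases} \dfrac{1}{\lambda(\Lambda)}, & \Theta \in \Lambda \\ 0, & \Theta \notin \Lambda \end{cases} \tag{10}$$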
where $\Lambda$ is the domain of the parameter vector that users might know, and $\lambda(\Lambda)$ is the width of the domain. After substituting (7), (9), and (10) into (6) and integrating, the weight of each candidate joint distribution is defined as

$$W_{ijk} = \frac{1}{\lambda(\Lambda_{\mu_X})\, \lambda(\Lambda_{\mu_Y})\, \lambda(\Lambda_{\tau})} \iiint_{\Omega_{ijk} \cap \Lambda} \prod_{m=1}^{ns} c_k\left(F_X^i(x_m \mid \mu_X), F_Y^j(y_m \mid \mu_Y) \mid r_k^{-1}(\tau)\right) f_X^i(x_m \mid \mu_X)\, f_Y^j(y_m \mid \mu_Y)\, d\mu_X\, d\mu_Y\, d\tau \tag{11}$$

where the normalization constant $\Pr(D \mid I)$ is omitted for convenience and the prior domain is used as the domain of integration. The triple integration of (11) is calculated using the triplequad function in the commercial code Matlab, which uses adaptive Simpson quadrature. The normalized weight is calculated by

$$w_{ijk} = \frac{W_{ijk}}{\sum_{i=1}^{7} \sum_{j=1}^{7} \sum_{k=1}^{9} W_{ijk}} \tag{12}$$

After the normalized weights of the candidate joint distributions are calculated, the one with the highest normalized weight among the 441 candidates is selected as the correct joint distribution.

If the prior distribution is known, then the correct model can be identified more often, especially when the number of samples is small. However, the prior distribution is usually unknown, and if the wrong prior distribution is used, a wrong model could be identified. Thus, in this paper, the uniform distribution is used as the prior distribution, as shown in (10), which makes the calculated weight depend more on the data. As the number of samples increases, the effect of the prior distribution on the weight becomes negligible.

3.2 Two-step weight-based Bayesian method

The two-step Bayesian method identifies the marginal distributions of X and Y first, and then a copula using the identified marginal distributions.
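The integrate-then-normalize idea behind (11) and (12) can be sketched with a single nuisance parameter, the mean, which is also how the first step of the two-step procedure treats a marginal distribution. The sketch below is only an illustration under stated assumptions: it compares two candidates (Gaussian and lognormal) instead of seven, uses a fixed-step composite Simpson rule in place of Matlab's adaptive quadrature, and takes the prior domain `[lo, hi]` of the mean as a user-supplied guess. All function names are hypothetical.

```python
import math

def normal_pdf(x, mean, std):
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def lognormal_pdf(x, mean, std):
    """Lognormal PDF parameterized by its mean and standard deviation (mean > 0)."""
    if x <= 0.0:
        return 0.0
    s2 = math.log(1.0 + (std / mean) ** 2)   # variance of log(X)
    m = math.log(mean) - 0.5 * s2            # mean of log(X)
    z = (math.log(x) - m) / math.sqrt(s2)
    return math.exp(-0.5 * z * z) / (x * math.sqrt(2.0 * math.pi * s2))

def likelihood(pdf, data, mean, std):
    """Product of the marginal PDF over all sample points."""
    return math.prod(pdf(x, mean, std) for x in data)

def weight(pdf, data, std, lo, hi, n=200):
    """Likelihood integrated over the mean on [lo, hi], divided by the prior width."""
    h = (hi - lo) / n  # n must be even for the composite Simpson rule
    total = likelihood(pdf, data, lo, std) + likelihood(pdf, data, hi, std)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * likelihood(pdf, data, lo + i * h, std)
    return total * h / 3.0 / (hi - lo)

def normalized_weights(data, candidates, lo, hi):
    """Normalize the candidate weights so that they sum to one."""
    mean = sum(data) / len(data)
    std = math.sqrt(sum((x - mean) ** 2 for x in data) / (len(data) - 1))  # fixed at the sample value
    raw = {name: weight(pdf, data, std, lo, hi) for name, pdf in candidates.items()}
    total = sum(raw.values())
    return {name: w / total for name, w in raw.items()}

# Illustrative (made-up) data; the prior domain of the mean is an assumption
data = [3.1, 4.7, 5.2, 6.0, 4.4, 5.9, 7.3, 2.8, 5.5, 5.1]
weights = normalized_weights(data, {"Gaussian": normal_pdf, "Lognormal": lognormal_pdf}, 3.0, 7.0)
best = max(weights, key=weights.get)
```

The candidate with the largest normalized weight is the one selected. Extending the sketch to the copula step would replace the marginal PDF with a copula density evaluated at the identified marginal CDF values and integrate over Kendall's tau instead of the mean.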
Using the same procedure as the one-step Bayesian method, the weight of each marginal distribution can be obtained as

$$W_i = \frac{1}{\lambda(\Lambda_\gamma)} \int_{\Lambda_\gamma \cap \Omega_i^\gamma} \prod_{m=1}^{ns} f_X^i\left(x_m \mid a(\gamma, \sigma), b(\gamma, \sigma)\right) d\gamma \tag{13}$$

by integrating the likelihood function of the ith candidate marginal distribution $M_i$ over the parameter $\gamma$ (the mean), where $f_X^i(x_m \mid a(\gamma, \sigma), b(\gamma, \sigma))$ is the ith marginal PDF evaluated at the mth sample point $x_m$ of $X$ for $m = 1, \ldots, ns$. In (13), $a$ and $b$ are the parameters of the ith marginal PDF, which are expressed in terms of the mean $\gamma$ and standard deviation $\sigma$ (Noh et al. 2010). Equation (13) can also be used to calculate the weight of the jth candidate marginal PDF of $Y$ for $j = 1, \ldots, 7$.

For the copula, the weight of each candidate copula can be defined as

$$W_k = \frac{1}{\lambda(\Lambda_\gamma)} \int_{\Lambda_\gamma \cap \Omega_k^\gamma} \prod_{m=1}^{ns} c_k\left(u_m, v_m \mid r_k^{-1}(\gamma)\right) d\gamma \tag{14}$$

where the parameter $\gamma$ is Kendall's tau, and $c_k(u_m, v_m \mid r_k^{-1}(\gamma))$ is the copula density function value of the kth candidate for $k = 1, \ldots, 9$ at the identified marginal CDF values, $u_m = F_X(x_m)$ and $v_m = F_Y(y_m)$, for $m = 1, \ldots, ns$. The normalized weights of (13) and (14) are obtained as in (12), except that the denominator is the sum of the weights of the candidate marginal distributions or of the candidate copulas. The one-dimensional integrations of (13) and (14) are calculated using the quad function in Matlab, which uses adaptive Simpson quadrature. In this process, there are 23 (7 + 7 + 9) candidates to be tested, far fewer than the 441 candidates that need to be tested for the one-step procedure. Thus, the two-step procedure is much more efficient.

4 Markov Chain Monte Carlo simulation-based Bayesian method

The MCMC-based Bayesian method identifies a correct model (marginal distribution, copula, or joint distribution) among the candidates using a criterion such as the deviance information criterion (DIC).
Using MCMC, samples of the parameter vector consisting of the mean, standard deviation, and correlation parameter are randomly generated from the posterior distribution of the parameter vector, and those samples are used to calculate the DIC value for each candidate. The smaller the DIC value, the better the model fits the data. Thus, the candidate with the lowest DIC value is identified as the correct model.

4.1 One-step MCMC-based Bayesian method

Since the posterior distribution is proportional to the product of the likelihood function $L(\Theta \mid \mathbf{x}, \mathbf{y}, M_{ijk})$ in (6) and the prior distribution of the parameter vector $g(\Theta)$, it is written as

$$g_{ijk}(\Theta \mid \mathbf{x}, \mathbf{y}) \propto L(\Theta \mid \mathbf{x}, \mathbf{y}, M_{ijk})\, g(\Theta) \tag{15}$$

The prior distribution of the parameter vector is usually unknown, so the uniform distribution is used as the prior distribution, as in the weight-based Bayesian method.

To identify a model based on the given data, the MCMC-based Bayesian method requires samples of the parameter vector drawn from its posterior distribution. These are hard to obtain directly because the posterior distribution does not have a standard form such as Gaussian or lognormal. Thus, MCMC simulation is used to sample from the posterior distribution of the parameter vector.

MCMC simulation is a method of generating random samples from probability distributions via Markov chains (Gamerman and Lopes 2006; Gelman et al. 2004; Robert and Casella 2004). The objective of MCMC is to generate one or more values of a parameter vector $\Theta$. Rather than attempting to draw samples directly from the probability distribution of $\Theta$, $g_{ijk}(\Theta \mid \mathbf{x}, \mathbf{y})$, a sequence $[\Theta^{(1)}, \Theta^{(2)}, \ldots, \Theta^{(t)}, \ldots]$ is generated in which each vector $\Theta^{(t)}$ depends on the preceding ones, $\Theta^{(1)}, \Theta^{(2)}, \ldots, \Theta^{(t-1)}$. For a sufficiently large number $t$, e.g., 1,000, $\Theta^{(t)}$ is approximately generated from $g_{ijk}(\Theta \mid \mathbf{x}, \mathbf{y})$.

Slice sampling (Neal 2003) and Metropolis-Hastings (Metropolis et al. 1953) are the most popular MCMC methods. The Metropolis-Hastings method requires a proposal distribution, which is used to draw candidate samples, whereas the slice sampling method does not. To produce samples efficiently using the Metropolis-Hastings method, it is important to select a good proposal distribution.
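To make the role of the proposal distribution concrete, the following is a minimal random-walk Metropolis sketch, the symmetric-proposal special case of Metropolis-Hastings. It is not the paper's implementation. It samples the mean of a Gaussian marginal under a flat prior with the standard deviation held fixed, and the proposal standard deviation `step` is exactly the tuning choice the text refers to: too small and the chain moves slowly, too large and most candidates are rejected. The function names and the data are hypothetical.

```python
import math
import random

def log_posterior(mu, data, sigma):
    """Log posterior of the mean (flat prior, Gaussian likelihood), up to a constant."""
    return -0.5 * sum(((x - mu) / sigma) ** 2 for x in data)

def metropolis(data, sigma, start, step, n_samples, seed=0):
    """Random-walk Metropolis with a Gaussian proposal of standard deviation `step`."""
    rng = random.Random(seed)
    mu = start
    logp = log_posterior(mu, data, sigma)
    chain = []
    for _ in range(n_samples):
        candidate = mu + rng.gauss(0.0, step)            # draw from the proposal
        logp_candidate = log_posterior(candidate, data, sigma)
        # Accept with probability min(1, posterior ratio); 1 - random() lies in (0, 1]
        if math.log(1.0 - rng.random()) < logp_candidate - logp:
            mu, logp = candidate, logp_candidate
        chain.append(mu)
    return chain

# Illustrative (made-up) data with sample mean 5.0
data = [3.1, 4.7, 5.2, 6.0, 4.4, 5.9, 7.3, 2.8, 5.5, 5.1]
chain = metropolis(data, sigma=2.5, start=0.0, step=1.0, n_samples=6000, seed=1)
kept = chain[1000:]                                       # discard burn-in
posterior_mean = sum(kept) / len(kept)
```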
Thus, the slice sampling method, which does not require a proposal distribution, is preferred.

Consider a single parameter, e.g., the mean. The first step of slice sampling is to assume an initial value $\mu^{(t)}$ within the domain of the posterior distribution $g(\mu \mid \mathbf{x})$ of the given data $\mathbf{x}$. Then, a value $y$ is drawn uniformly from $(0, g(\mu^{(t)} \mid \mathbf{x}))$, as shown in Fig. 1a. A horizontal slice can thus be defined as $S = \{\mu : y < g(\mu \mid \mathbf{x})\}$, and $\mu^{(t)}$ is always within $S$, which is indicated by bold lines in Fig. 1b. In the second step, an interval $I = (L, R)$ is found around $\mu^{(t)}$ that contains all, or much, of the slice $S$. Let the initial length of the interval $I$ be $w$; the circular dots in Fig. 1b indicate the left and right bounds $L$ and $R$ of the interval. The interval is expanded until both ends are outside the slice, as shown in Fig. 1b. In the third step, the sequential integer $t$ is increased to $t + 1$, and points are drawn within the interval until a point inside the slice is found, which becomes the new point $\mu^{(t+1)}$. Points that are picked outside the slice, such as $\mu^*$ in Fig. 1c, are used to reduce the interval size, as indicated by rectangular dots in Fig. 1c. These steps are repeated until the desired number of samples for the slice sampling, N = 1,000, is achieved.

Fig. 1 Slice sampling (Neal 2003)

Using slice sampling, N samples of the parameter vector $[\Theta^{(1)}, \Theta^{(2)}, \ldots, \Theta^{(N)}]$ are obtained, and those can be used to estimate the parameters of the original data or confidence intervals of the parameters. In this paper, the samples of the parameter vector are used to identify a correct model among the candidates using the DIC as follows.

The DIC is used to select the joint distribution that best fits the given data $\mathbf{x}$ and $\mathbf{y}$. The DIC is defined as (Gelman et al. 2004; Spiegelhalter et al. 2002)

$$\mathrm{DIC} = \bar{D} + p_D = 2\bar{D} - D(\bar{\Theta}) \tag{16}$$

where $\bar{D}$ is the expectation of the deviance function, which is defined as

$$D(\Theta) = -2 \log\left[L(\Theta \mid \mathbf{x}, \mathbf{y}, M_{ijk})\right] \tag{17}$$

and $p_D$ is the effective number of parameters of the model, which is computed as $p_D = \bar{D} - D(\bar{\Theta})$, where $\bar{\Theta}$ is the expectation of the parameter vector. In (16), since the expectation of the deviance function $\bar{D}$ indicates how well the model fits the data, the smaller $\bar{D}$ is, the better the model fits the data. On the other hand, $p_D$ indicates the complexity of the model, so a larger $p_D$ value means it is easier for the model to fit the data. However, it does not necessarily mean that the complex
model, i.e., one with large $p_D$, represents the true model better than a less complex model. For example, a fifth-order polynomial can exactly fit six points. However, if those six points are not properly distributed, then the higher-order polynomial is not useful for representing the true response. Thus, a moderate model with a good fit to the data (with small $\bar{D}$ and $p_D$), that is, the one with the smallest DIC, is selected as the correct model.

Using (16), the DIC of a candidate $M_{ijk}$ can be written as

$$\mathrm{DIC}(M_{ijk}) = 2E\left[D(\Theta) \mid \mathbf{x}, \mathbf{y}, M_{ijk}\right] - D\left(E\left[\Theta \mid \mathbf{x}, \mathbf{y}, M_{ijk}\right]\right) \tag{18}$$

where $E[D(\Theta) \mid \mathbf{x}, \mathbf{y}, M_{ijk}]$ is approximated as

$$E\left[D(\Theta) \mid \mathbf{x}, \mathbf{y}, M_{ijk}\right] \approx \frac{1}{N} \sum_{l=1}^{N} D\left(\Theta^{(l)}\right) \tag{19}$$

and $E[\Theta \mid \mathbf{x}, \mathbf{y}, M_{ijk}]$ can be approximated as

$$E\left[\Theta \mid \mathbf{x}, \mathbf{y}, M_{ijk}\right] \approx \frac{1}{N} \sum_{l=1}^{N} \Theta^{(l)} \tag{20}$$

Substituting (20) into the deviance function in (17) and calculating (19), the DIC value can be calculated for each candidate $M_{ijk}$.

4.2 Two-step MCMC-based Bayesian method

For the two-step MCMC-based Bayesian method, the likelihood function of the ith candidate marginal distribution $M_i$ with the mean and standard deviation of $X$ is defined as

$$L(\mathbf{x} \mid \mu_X, \sigma_X, M_i) = \prod_{m=1}^{ns} f_X^i(x_m \mid \mu_X, \sigma_X) \tag{21}$$

For $Y$, $L(\mathbf{y} \mid \mu_Y, \sigma_Y, M_j)$ is the likelihood function of the jth candidate marginal distribution $M_j$ with PDF $f_Y^j(y \mid \mu_Y, \sigma_Y)$. Likewise, the likelihood function of the kth candidate copula $M_k$ is defined as

$$L(\mathbf{x}, \mathbf{y} \mid \theta, M_k) = \prod_{m=1}^{ns} c_k(u_m, v_m \mid \theta) \tag{22}$$

where $u_m$ and $v_m$ are the identified marginal CDF values at $x_m$ and $y_m$, as in (14). Using the likelihood functions of the marginal distributions and the copula, the deviance functions can be obtained. Using the deviance functions and the samples of the parameter vectors generated by slice sampling, the DIC values of the candidate marginal distributions and copulas can be obtained using (18). As with the weight-based Bayesian method, there are 441 candidates to be tested for the one-step procedure, whereas 23 candidates need to be tested for the two-step procedure.
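The slice sampling steps and the DIC estimate in (16) to (20) can be sketched together for the simplest case: one candidate marginal distribution (Gaussian) with one sampled parameter (the mean), a flat prior, and a fixed standard deviation. This is a minimal sketch under those assumptions, not the authors' code. The stepping-out and shrinkage follow the univariate procedure described by Neal (2003), the work is done on the log scale to avoid underflow, and all names and data are hypothetical.

```python
import math
import random

def log_likelihood(mu, data, sigma):
    """Log of the Gaussian marginal likelihood for mean mu and fixed sigma."""
    n = len(data)
    return (-n * math.log(sigma * math.sqrt(2.0 * math.pi))
            - 0.5 * sum(((x - mu) / sigma) ** 2 for x in data))

def slice_sample(log_g, start, width, n_samples, seed=0):
    """Univariate slice sampler with stepping-out and shrinkage."""
    rng = random.Random(seed)
    mu = start
    samples = []
    for _ in range(n_samples):
        # Step 1: draw the slice level uniformly below g(mu), on the log scale
        log_y = log_g(mu) + math.log(1.0 - rng.random())
        # Step 2: place an interval of length `width` around mu and expand it
        left = mu - width * rng.random()
        right = left + width
        while log_g(left) > log_y:
            left -= width
        while log_g(right) > log_y:
            right += width
        # Step 3: draw from the interval, shrinking it after each rejected point
        while True:
            candidate = rng.uniform(left, right)
            if log_g(candidate) > log_y:
                mu = candidate
                break
            if candidate < mu:
                left = candidate
            else:
                right = candidate
        samples.append(mu)
    return samples

def dic(samples, data, sigma):
    """DIC = 2 * mean deviance - deviance at the posterior mean; also returns p_D."""
    deviances = [-2.0 * log_likelihood(m, data, sigma) for m in samples]
    d_bar = sum(deviances) / len(deviances)
    mu_bar = sum(samples) / len(samples)
    d_at_mean = -2.0 * log_likelihood(mu_bar, data, sigma)
    return 2.0 * d_bar - d_at_mean, d_bar - d_at_mean

# Illustrative (made-up) data; with a flat prior the posterior is proportional to the likelihood
data = [3.1, 4.7, 5.2, 6.0, 4.4, 5.9, 7.3, 2.8, 5.5, 5.1]
samples = slice_sample(lambda m: log_likelihood(m, data, 2.5), 5.0, 1.0, 1000, seed=2)
dic_value, p_d = dic(samples, data, 2.5)
```

Repeating this for each candidate and keeping the one with the smallest DIC value mirrors the selection rule in the text. With a single sampled parameter, the effective number of parameters `p_d` should come out close to one.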
Table 1 Four cases

          Marginal distributions
          X            Y            Copula    Kendall's tau
Case 1    Extreme-II   Extreme-I    Gumbel    0.7
Case 2    Gaussian     Weibull      Frank     0.7
Case 3    Weibull      Gumbel       A12
Case 4    Lognormal    Extreme-II   Clayton

5 Comparison of methods

In Sections 5.1 and 5.2, the weight-based and MCMC-based Bayesian methods using one-step and two-step procedures are compared in terms of accuracy and efficiency, respectively.

5.1 Accuracy test

Since it is impossible to show simulation results for all 441 candidate joint distributions in the paper, four joint distributions with various combinations of marginal distributions, copulas, and Kendall's tau are considered as true models, as shown in Table 1. Even though only bivariate joint distributions with positive correlation are tested in this paper, those with negative correlation will yield similar identification results because the joint PDFs with negative correlation have the rotated shapes of those with positive correlation. Moreover, multivariate distributions are not commonly used in practical applications, so they are not considered.

Fig. 2 Joint PDF contours of four cases

The means and standard deviations are given as $\mu_X = \mu_Y = 5.0$ and $\sigma_X = \sigma_Y = 2.5$ for X and Y, respectively. The parameters of the non-Gaussian candidate marginal distributions are calculated from the given mean and standard deviation using explicit functions (Noh
et al. 2010). Likewise, the correlation parameters of the candidate copulas are obtained using (3) or the explicit functions in Noh et al. (2010). The joint PDF contours of the four cases are shown in Fig. 2.

For the identification of a correct joint distribution, a data set is randomly generated from a true joint distribution, and then the joint distribution that best fits the data among the candidates is selected as the correct one based on the estimated weight or DIC values. To test the performance of the weight-based and MCMC-based methods, this procedure is repeated 100 times for ns = 30, 100, and 300. Using the 100 randomly generated data sets for each number of samples, the averaged normalized weights and the number of correct identifications are calculated.

Fig. 3 Averaged normalized weights using one-step weight-based Bayesian method (panels: Clayton, AMH, Gumbel, Frank, A12, A14, FGM, Gaussian, Independent)

Fig. 4 Averaged normalized weights using two-step weight-based Bayesian method (marginal candidates for X and Y: Gaussian, Weibull, Gamma, Lognormal, Gumbel, Extreme-I, Extreme-II; copula candidates: Clayton, AMH, Gumbel, Frank, A12, A14, FGM, Gaussian, Independent)

5.1.1 Comparison of one-step and two-step weight-based Bayesian methods

Using each data set with a specified sample size ns that is randomly generated from the true joint distribution, the one-step weight-based Bayesian method calculates the normalized weights of all 441 candidate joint distributions. Figure 3 shows the averaged normalized weights of the 441 candidates over 100 trials using the one-step weight-based Bayesian method for sample data of size ns = 30 obtained from the true joint distribution Case 3 in Table 1, which is modeled by Weibull and Gumbel distributions and the A12 copula. In Fig. 3, each candidate copula has a 7 × 7 matrix whose rows (X) and columns (Y) indicate the seven candidate marginal distributions in the order Gaussian, Weibull, Gamma, Lognormal, Gumbel, Extreme-I, and Extreme-II. For example, the first row and second column indicates that X and Y have Gaussian and Weibull distributions, respectively. Since the sum of the normalized weights of all 441 candidates is one, the normalized weights of many candidates are very small and some of them have zero values, which are shown as blanks in Fig. 3. Even though A12, which is the correct copula, has higher normalized weights than the other copulas in Fig. 3, the normalized weight of the correct model, indicated as a shaded box in Fig. 3, is only 0.060, which could make the identification of the correct joint distribution difficult.

Table 2 Averaged normalized weights of correct joint CDFs using one-step weight-based method

          ns = 30   ns = 100   ns = 300
Case 1
Case 3
On the other hand, the two-step weight-based Bayesian method calculates the weights of the seven candidate marginal distributions of X and Y, and then calculates the weights of the nine candidate copulas using the identified marginal distributions. Thus, 23 (7 + 7 + 9) candidate marginal distributions and copulas are tested. Since the two-step Bayesian method separately calculates the normalized weights of the seven marginal distributions and nine copulas, the normalized weights of the correct marginal distributions and copula are more easily distinguishable than with the one-step Bayesian method, as shown in Fig. 4.

Next, the averaged normalized weights of the candidate models are calculated using the one-step and two-step weight-based Bayesian methods for two cases, Cases 1 and 3, using different numbers of samples, ns = 30, 100, and 300. For the one-step Bayesian method, since it would take too much space to show results like Fig. 3 for all cases and numbers of samples, only the averaged normalized weights of the correct models are presented, as shown in Table 2. Since the PDF shape of Case 1 is more distinct among the candidates than that of Case 3, the averaged normalized weights of the correct joint distribution for Case 1 are larger than those for Case 3. However, the normalized weights of Case 1 using the one-step weight-based method are still smaller than the normalized weights using the two-step weight-based method, especially for a small number of samples (Table 3). Accordingly, as shown in Table 4, the number of correct identifications using the one-step weight-based Bayesian method is smaller than that using the two-step weight-based Bayesian method.

Table 4 Number of correct identifications of joint CDFs using weight-based Bayesian method

Joint           One-step                        Two-step
distributions   ns = 30   ns = 100   ns = 300   ns = 30   ns = 100   ns = 300
Case 1
Case 2
Case 3
Case 4
Table 3 Averaged normalized weights of correct marginal CDFs and copulas using two-step weight-based method

                    ns = 30   ns = 100   ns = 300
Case 1    X
          Y
          Copula
Case 3    X
          Y
          Copula

Table 5 Number of correct identifications of joint CDFs using MCMC-based Bayesian method

Joint           One-step                        Two-step
distributions   ns = 30   ns = 100   ns = 300   ns = 30   ns = 100   ns = 300
Case 1
Case 2
Case 3
Case 4

As the number of samples increases, the performance of the one-step weight-based
method improves, but it is still not as good as the performance of the two-step weight-based method. When the joint distribution shapes are not distinct, as in Case 3, it becomes more challenging for the one-step Bayesian method to identify the correct joint distribution because candidate joint distributions whose PDF shapes are similar to that of the correct joint distribution have normalized weights similar to the correct one. Thus, the two-step weight-based method is preferred to the one-step weight-based method.

5.1.2 Comparison of one-step and two-step MCMC-based Bayesian methods

Table 5 shows the number of correct identifications of the four cases using the one-step and two-step MCMC-based Bayesian methods for ns = 30, 100, and 300. A fair number of candidate joint distributions have PDF shapes similar to the correct one, which leads to similar DIC values among the candidate joint distributions, especially for Case 3. The marginal PDF and copula shapes are more distinctive than the joint PDF shapes, which leads to rather different DIC values among the candidate marginal distributions and copulas. Thus, it is more difficult to identify the correct joint distribution with the one-step Bayesian method than with the two-step Bayesian method. As in Table 4, the number of correct identifications using the one-step MCMC-based method is smaller than that using the two-step MCMC-based method. As the number of samples increases, the performance of the one-step MCMC-based Bayesian method improves, but it remains less effective than the two-step MCMC-based Bayesian method.

The one-step method might identify a correct joint distribution among candidate joint distributions that the two-step method does not even test. However, when the one-step method is used, the number of candidate marginal distributions and copulas should be chosen carefully.
For example, assume that the numbers of candidate marginal distributions and copulas are 8 and 10, respectively. Even though only one candidate marginal distribution and one candidate copula are added to the original candidates, the total number of candidate joint distributions increases from 7 × 7 × 9 (441) to 8 × 8 × 10 (640). In this case, it is even harder to identify the correct joint distribution among the 640 candidates.

5.1.3 Comparison of two-step weight-based and MCMC-based Bayesian methods

In Sections 5.1.1 and 5.1.2, the one-step and two-step procedures are compared for the weight-based and MCMC-based methods, respectively. From the simulation results, the two-step Bayesian methods are preferred to the one-step Bayesian methods. In this section, the two-step weight-based and MCMC-based Bayesian methods are compared.

As shown in Tables 4 and 5, the performances of the two methods are very similar. However, the MCMC-based Bayesian method depends on random samples of the parameter vector obtained from slice sampling. When the true distributions do not have distinct shapes, as in Case 3, the randomness of the samples of the parameter vector could affect the performance. For example, even though the same data are used to generate the random samples of the parameter vector, the identified distribution could differ according to the generated samples of the parameter vector.

Suppose that the true model is Case 3 and two data sets with ns = 30 are generated. Table 6 shows the DIC values of the seven candidate marginal distributions for X and Y obtained from two different sets of samples of the parameter vector. In this case, both sets correctly identify A12, which is the correct copula, so the DIC values of the candidate copulas are not presented. However, the two-step MCMC-based method using Set 1 identifies Weibull and Gumbel (the true model) as the correct marginal distributions, whereas the one using Set 2 identifies Gamma as the correct marginal distribution for both X and Y.
The two sets disagree because the joint PDF shapes of Case 3 and the identified model (Gamma, Gamma, and Clayton copula) are similar, as shown in Fig. 5. Consequently, as shown in Table 6, the DIC values of the Weibull, Gamma, and Gumbel distributions are very similar, so different marginal distributions are identified for the two sets of samples of the parameter vector even though the same data are used. To avoid this problem, the number of samples of the parameter vector was increased up to 2,000, but the results were still inconsistent and the computational time increased rapidly. It could be interesting to test cases with different correlation coefficients, but the general trend is not expected to change, because it stems from the randomness of the slice samples used in the MCMC-based method. Thus, the two-step weight-based Bayesian method is preferred over the two-step MCMC-based Bayesian method.

Table 6 DIC values for identification of marginal CDFs for Case 3 (two-step approach)
Slice sampling   Margin   Gaussian   Weibull   Gamma   Lognormal   Gumbel   Ext. I   Ext. II
Set 1            X
Set 1            Y
Set 2            X
Set 2            Y

Fig. 5 PDF contours obtained from two different sets of slice sampling

5.2 Efficiency test

To test how efficiently the Bayesian methods identify a correct joint distribution among candidates, the computational time is measured for one data set, which is randomly generated from a true distribution. Since the computational times are similar for the four cases, only Case 1 is considered. Table 7 shows the computational time when the two-step weight-based and MCMC-based Bayesian methods are used to identify the correct marginal distribution among seven candidates. The weight-based Bayesian method is more efficient than the MCMC-based Bayesian method because the MCMC-based method spends additional time generating random samples of the parameter vector (N = 1,000). Likewise, Table 8 shows the computational time when the two-step weight-based and MCMC-based Bayesian methods are used to identify a copula among nine candidates. Again, the two-step weight-based method is more efficient than the two-step MCMC-based method in identifying the correct copula.

Table 7 Computational time using two-step weight-based and MCMC-based Bayesian methods (marginal CDF)
Method   ns = 30   ns = 100   ns = 300
MCMC     s         s          s
Weight   s         s          s

Table 8 Computational time using two-step weight-based and MCMC-based Bayesian methods (copula)
Method   ns = 30   ns = 100   ns = 300
MCMC     s         s          s
Weight   s         s          s

Table 9 displays the computational time to identify a joint distribution using the weight-based and MCMC-based Bayesian methods with the one-step and two-step procedures. In the two-step Bayesian methods, the total computational time to identify a joint distribution is calculated as twice the marginal identification time in Table 7 (once each for X and Y) plus the copula identification time in Table 8.
As shown in Table 9, since the one-step Bayesian method calculates the weights of 441 candidate joint distributions, the computational time of the one-step Bayesian methods is much larger than that of the two-step Bayesian methods. The weight-based Bayesian methods integrate the likelihood function of all candidate joint distributions over the means of X and Y and Kendall's tau, whereas the MCMC-based Bayesian methods generate random samples of the parameter vector for all candidates. Since calculating the integrations is more efficient than generating random samples and calculating DIC values, the weight-based method is more efficient than the MCMC-based method, as shown in Table 9.

In summary, the two-step Bayesian methods identify a correct joint distribution more accurately and efficiently than the one-step Bayesian methods. In terms of accuracy, the two-step weight-based Bayesian method is similar to the two-step MCMC-based Bayesian method, but the two-step weight-based method identifies the correct distribution more efficiently. Moreover, the normalized weights are calculated consistently for given data, whereas the DIC values can vary with the randomly generated samples of the parameter vector. Thus, the two-step weight-based Bayesian method is preferred to the one-step weight-based Bayesian method and to the one-step and two-step MCMC-based Bayesian methods.

Table 9 Computational time using weight-based and MCMC-based Bayesian methods with one-step and two-step procedures (joint CDF)
ns    Procedure   Weight    MCMC
30    One-step    s         1,649 s
30    Two-step    s         s
100   One-step    1,218 s   2,008 s
100   Two-step    s         s
300   One-step    3,045 s   4,956 s
300   Two-step    s         s
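The gap between the one-step and two-step timings follows largely from how many candidates each procedure evaluates. The sketch below counts them, assuming the same set of candidate marginal distributions is used for both X and Y. The two-step count (marginals for X, marginals for Y, then copulas) is a reading of the procedure described above, and the function names are hypothetical.

```python
def one_step_candidates(n_marginals: int, n_copulas: int) -> int:
    """Every (marginal of X, marginal of Y, copula) combination is tested jointly."""
    return n_marginals * n_marginals * n_copulas


def two_step_candidates(n_marginals: int, n_copulas: int) -> int:
    """Marginals are tested for X and for Y separately, then the copulas."""
    return 2 * n_marginals + n_copulas


for n_marginals, n_copulas in [(7, 9), (8, 10)]:
    print(
        f"{n_marginals} marginals, {n_copulas} copulas: "
        f"one-step = {one_step_candidates(n_marginals, n_copulas)}, "
        f"two-step = {two_step_candidates(n_marginals, n_copulas)}"
    )
```

With seven marginal distributions and nine copulas, the one-step procedure evaluates 441 joint candidates while the two-step procedure evaluates 23 separate ones. Adding one candidate of each type raises these counts to 640 and 26.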

6 Conclusion

Different Bayesian methods have been proposed in the literature to identify correct models for marginal distributions, copulas, and joint distributions, but it has not been tested which Bayesian method identifies a correct model more accurately and efficiently. In this paper, two recently developed Bayesian methods, the weight-based and MCMC-based Bayesian methods, are compared using one-step and two-step procedures through simulation studies. For the comparison studies, a one-step weight-based method and a two-step MCMC-based method using parametric marginal distributions are developed in this paper.

Through simulation studies, it is demonstrated that the two-step approach identifies the correct joint distribution more accurately and efficiently than the one-step approach for both the weight-based and MCMC-based methods. The two-step weight-based and MCMC-based Bayesian methods show similar performance in identifying a correct joint distribution. However, because the generated samples of the parameter vector are random, the model identified by the MCMC-based method could differ even though the same data are used. On the other hand, the weight-based Bayesian method identifies the same model as long as the same data are used. In addition, the two-step weight-based method is far more efficient than the two-step MCMC-based method when calculating the weights of the candidate marginal distributions and copulas. Thus, the two-step weight-based Bayesian method is the preferred method.

Acknowledgments This research is supported by the Automotive Research Center, which is sponsored by the U.S. Army TARDEC, and ARO Project W911NF. This support is greatly appreciated.


More information

Bayesian Inference for Conditional Copula models with Continuous and Binary Responses

Bayesian Inference for Conditional Copula models with Continuous and Binary Responses Bayesian Inference for Conditional Copula models with Continuous and Binary Responses Radu Craiu Department of Statistics University of Toronto Joint with Avideh Sabeti (Toronto) and Mian Wei (Toronto)

More information

Using copulas to model time dependence in stochastic frontier models

Using copulas to model time dependence in stochastic frontier models Using copulas to model time dependence in stochastic frontier models Christine Amsler Michigan State University Artem Prokhorov Concordia University November 2008 Peter Schmidt Michigan State University

More information

Principles of Bayesian Inference

Principles of Bayesian Inference Principles of Bayesian Inference Sudipto Banerjee University of Minnesota July 20th, 2008 1 Bayesian Principles Classical statistics: model parameters are fixed and unknown. A Bayesian thinks of parameters

More information

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements

Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Supplement to A Hierarchical Approach for Fitting Curves to Response Time Measurements Jeffrey N. Rouder Francis Tuerlinckx Paul L. Speckman Jun Lu & Pablo Gomez May 4 008 1 The Weibull regression model

More information

Likelihood-free MCMC

Likelihood-free MCMC Bayesian inference for stable distributions with applications in finance Department of Mathematics University of Leicester September 2, 2011 MSc project final presentation Outline 1 2 3 4 Classical Monte

More information

Dynamic System Identification using HDMR-Bayesian Technique

Dynamic System Identification using HDMR-Bayesian Technique Dynamic System Identification using HDMR-Bayesian Technique *Shereena O A 1) and Dr. B N Rao 2) 1), 2) Department of Civil Engineering, IIT Madras, Chennai 600036, Tamil Nadu, India 1) ce14d020@smail.iitm.ac.in

More information

Reminder of some Markov Chain properties:

Reminder of some Markov Chain properties: Reminder of some Markov Chain properties: 1. a transition from one state to another occurs probabilistically 2. only state that matters is where you currently are (i.e. given present, future is independent

More information

Bivariate Rainfall and Runoff Analysis Using Entropy and Copula Theories

Bivariate Rainfall and Runoff Analysis Using Entropy and Copula Theories Entropy 2012, 14, 1784-1812; doi:10.3390/e14091784 Article OPEN ACCESS entropy ISSN 1099-4300 www.mdpi.com/journal/entropy Bivariate Rainfall and Runoff Analysis Using Entropy and Copula Theories Lan Zhang

More information

Bayesian Inference for Pair-copula Constructions of Multiple Dependence

Bayesian Inference for Pair-copula Constructions of Multiple Dependence Bayesian Inference for Pair-copula Constructions of Multiple Dependence Claudia Czado and Aleksey Min Technische Universität München cczado@ma.tum.de, aleksmin@ma.tum.de December 7, 2007 Overview 1 Introduction

More information

K-ANTITHETIC VARIATES IN MONTE CARLO SIMULATION ISSN k-antithetic Variates in Monte Carlo Simulation Abdelaziz Nasroallah, pp.

K-ANTITHETIC VARIATES IN MONTE CARLO SIMULATION ISSN k-antithetic Variates in Monte Carlo Simulation Abdelaziz Nasroallah, pp. K-ANTITHETIC VARIATES IN MONTE CARLO SIMULATION ABDELAZIZ NASROALLAH Abstract. Standard Monte Carlo simulation needs prohibitive time to achieve reasonable estimations. for untractable integrals (i.e.

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Bayesian Inference for Clustered Extremes

Bayesian Inference for Clustered Extremes Newcastle University, Newcastle-upon-Tyne, U.K. lee.fawcett@ncl.ac.uk 20th TIES Conference: Bologna, Italy, July 2009 Structure of this talk 1. Motivation and background 2. Review of existing methods Limitations/difficulties

More information

Markov Chain Monte Carlo

Markov Chain Monte Carlo Markov Chain Monte Carlo Recall: To compute the expectation E ( h(y ) ) we use the approximation E(h(Y )) 1 n n h(y ) t=1 with Y (1),..., Y (n) h(y). Thus our aim is to sample Y (1),..., Y (n) from f(y).

More information

Tools for Parameter Estimation and Propagation of Uncertainty

Tools for Parameter Estimation and Propagation of Uncertainty Tools for Parameter Estimation and Propagation of Uncertainty Brian Borchers Department of Mathematics New Mexico Tech Socorro, NM 87801 borchers@nmt.edu Outline Models, parameters, parameter estimation,

More information

Optimal exact tests for complex alternative hypotheses on cross tabulated data

Optimal exact tests for complex alternative hypotheses on cross tabulated data Optimal exact tests for complex alternative hypotheses on cross tabulated data Daniel Yekutieli Statistics and OR Tel Aviv University CDA course 29 July 2017 Yekutieli (TAU) Optimal exact tests for complex

More information

First steps of multivariate data analysis

First steps of multivariate data analysis First steps of multivariate data analysis November 28, 2016 Let s Have Some Coffee We reproduce the coffee example from Carmona, page 60 ff. This vignette is the first excursion away from univariate data.

More information

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations

The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations The Mixture Approach for Simulating New Families of Bivariate Distributions with Specified Correlations John R. Michael, Significance, Inc. and William R. Schucany, Southern Methodist University The mixture

More information

Bayesian Inference for DSGE Models. Lawrence J. Christiano

Bayesian Inference for DSGE Models. Lawrence J. Christiano Bayesian Inference for DSGE Models Lawrence J. Christiano Outline State space-observer form. convenient for model estimation and many other things. Bayesian inference Bayes rule. Monte Carlo integation.

More information

Multivariate Distribution Models

Multivariate Distribution Models Multivariate Distribution Models Model Description While the probability distribution for an individual random variable is called marginal, the probability distribution for multiple random variables is

More information

A simple graphical method to explore tail-dependence in stock-return pairs

A simple graphical method to explore tail-dependence in stock-return pairs A simple graphical method to explore tail-dependence in stock-return pairs Klaus Abberger, University of Konstanz, Germany Abstract: For a bivariate data set the dependence structure can not only be measured

More information

Multivariate Survival Data With Censoring.

Multivariate Survival Data With Censoring. 1 Multivariate Survival Data With Censoring. Shulamith Gross and Catherine Huber-Carol Baruch College of the City University of New York, Dept of Statistics and CIS, Box 11-220, 1 Baruch way, 10010 NY.

More information

BAYESIAN ESTIMATION OF LINEAR STATISTICAL MODEL BIAS

BAYESIAN ESTIMATION OF LINEAR STATISTICAL MODEL BIAS BAYESIAN ESTIMATION OF LINEAR STATISTICAL MODEL BIAS Andrew A. Neath 1 and Joseph E. Cavanaugh 1 Department of Mathematics and Statistics, Southern Illinois University, Edwardsville, Illinois 606, USA

More information

Reconstruction of individual patient data for meta analysis via Bayesian approach

Reconstruction of individual patient data for meta analysis via Bayesian approach Reconstruction of individual patient data for meta analysis via Bayesian approach Yusuke Yamaguchi, Wataru Sakamoto and Shingo Shirahata Graduate School of Engineering Science, Osaka University Masashi

More information

Item Parameter Calibration of LSAT Items Using MCMC Approximation of Bayes Posterior Distributions

Item Parameter Calibration of LSAT Items Using MCMC Approximation of Bayes Posterior Distributions R U T C O R R E S E A R C H R E P O R T Item Parameter Calibration of LSAT Items Using MCMC Approximation of Bayes Posterior Distributions Douglas H. Jones a Mikhail Nediak b RRR 7-2, February, 2! " ##$%#&

More information

A copula goodness-of-t approach. conditional probability integral transform. Daniel Berg 1 Henrik Bakken 2

A copula goodness-of-t approach. conditional probability integral transform. Daniel Berg 1 Henrik Bakken 2 based on the conditional probability integral transform Daniel Berg 1 Henrik Bakken 2 1 Norwegian Computing Center (NR) & University of Oslo (UiO) 2 Norwegian University of Science and Technology (NTNU)

More information

Modelling Dependence with Copulas and Applications to Risk Management. Filip Lindskog, RiskLab, ETH Zürich

Modelling Dependence with Copulas and Applications to Risk Management. Filip Lindskog, RiskLab, ETH Zürich Modelling Dependence with Copulas and Applications to Risk Management Filip Lindskog, RiskLab, ETH Zürich 02-07-2000 Home page: http://www.math.ethz.ch/ lindskog E-mail: lindskog@math.ethz.ch RiskLab:

More information

Sampling Methods (11/30/04)

Sampling Methods (11/30/04) CS281A/Stat241A: Statistical Learning Theory Sampling Methods (11/30/04) Lecturer: Michael I. Jordan Scribe: Jaspal S. Sandhu 1 Gibbs Sampling Figure 1: Undirected and directed graphs, respectively, with

More information

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix

Labor-Supply Shifts and Economic Fluctuations. Technical Appendix Labor-Supply Shifts and Economic Fluctuations Technical Appendix Yongsung Chang Department of Economics University of Pennsylvania Frank Schorfheide Department of Economics University of Pennsylvania January

More information