A Bayesian Approach for Sample Size Determination in Method Comparison Studies
Kunshan Yin (a), Pankaj K. Choudhary (a,1), Diana Varghese (b) and Steven R. Goodman (b)

(a) Department of Mathematical Sciences, (b) Department of Molecular and Cell Biology, University of Texas at Dallas

Abstract

Studies involving two methods for measuring a continuous response are regularly conducted in health sciences to evaluate agreement of a method with itself and agreement between methods. However, to our knowledge, no statistical literature is available on how to determine sample sizes for planning such studies when the goal is simultaneous evaluation of intra- and inter-method agreement. We develop a simulation-based Bayesian methodology for this problem in a hierarchical model framework. Unlike a frequentist approach, it takes into account the uncertainty in parameter estimates. The proposed methodology can be used with any scalar measure of agreement currently available in the literature. Its application is illustrated with two real examples: one from proteomics and the other involving blood pressure measurement.

Key Words: Agreement; Concordance correlation; Fitting and sampling priors; Hierarchical model; Sample size determination; Tolerance interval; Total deviation index.

1 Corresponding author. Address: EC 35, PO Box ; University of Texas at Dallas; Richardson, TX ; USA. pankaj@utdallas.edu; Tel: (972) ; Fax: (972)
1 Introduction

Comparison of two methods for measuring a continuous response variable is a topic of considerable interest in health sciences research. In practice, a method may be an assay, a medical device, a measurement technique or a clinical observer. The variable of interest, e.g., blood pressure, heart rate, or cardiac stroke volume, is typically an indicator of the health of the individual being measured. Over 11,000 method comparison studies have been published in the biomedical literature in the past three decades. Such a study has two possible goals. The first is to determine the extent of agreement between two methods. If this inter-method agreement is sufficiently good, the methods may be used interchangeably, or the cheaper or more convenient method may be preferred. The second goal is to evaluate the agreement of a method with itself. The extent of this intra-method agreement gives an idea of the repeatability of a method and serves as a baseline for evaluating inter-method agreement. The topic of how to assess agreement between two methods has received quite a bit of attention in the statistical literature; see Barnhart, Haber and Lin (2007) for a recent review. There are several measures of agreement, such as limits of agreement (Bland and Altman, 1986), concordance correlation (Lin, 1989), mean squared deviation (Lin, 2000), coverage probability (Lin et al., 2002), and total deviation index or tolerance interval (Lin, 2000; Choudhary and Nagaraja, 2007), among others. On the other hand, the topic of how to plan a method comparison study, in particular sample size determination (SSD), has not received the same level of attention. We are not aware of any literature on SSD when the interest lies in evaluating both intra- and inter-method agreement. This involves determining the number of replicate measurements in addition to the number of individuals for the study.
Lin (1992), Lin, Whipple and Ho (1998), Lin et al. (2002), and Choudhary and Nagaraja (2007) do provide frequentist sample size formulas associated with the agreement measures of their interest, but these are restricted to the evaluation of inter-method agreement. However, a simultaneous evaluation of both intra- and inter-method agreement is crucial because the amount of agreement possible between two methods is limited by how much the methods agree with themselves. A new method, even one that is perfect and worthy of adoption, will not agree well with an established standard if the latter has poor agreement with itself (Bland and Altman, 1999). Our aim in this article is to develop a methodology for determining the number of individuals and the number of replicate measurements needed for a method comparison study to achieve a given level of precision of simultaneous inference regarding intra- and inter-method agreement. We take a Bayesian route to this SSD problem as it allows us to overcome an important limitation of the frequentist procedure. In the frequentist paradigm, one generally casts the problem of agreement evaluation in a hypothesis testing framework, and then derives an approximate sample size formula by performing an asymptotic power analysis on the lines of Lehmann (1998, Section 3.3). But this formula involves the asymptotic standard error of the estimated agreement measure, which is a function of the unknown parameters in the model. So typically one obtains their estimates, either from the literature or through a pilot study, and substitutes them in the SSD formula. However, by merely inserting the estimates in the formula, no account is taken of the variability in these estimates. This issue can be addressed in a Bayesian paradigm; see, e.g., the review articles of Adcock (1997) and Wang and Gelfand (2002), and the references therein. The proposed procedure can be used in conjunction with any scalar measure of agreement currently available in the literature. This article is organized as follows.
In Section 2, we describe the SSD methodology. We investigate its properties in Section 3, and illustrate its application in Section 4 with two real
examples. Section 5 concludes with a discussion. All the computations reported here are performed using the software packages R (R Development Core Team, 2006) and WinBUGS (Spiegelhalter et al., 2003). We now introduce the two examples. In both cases, we treat the available data as pilot data and use the estimates obtained from them to determine sample sizes for a future study. These examples represent two qualitatively different method comparison studies. As shown in Section 3, in the first example the measurements on the same individual, whether from the same method or from two different methods, appear only weakly correlated. This scenario is somewhat unusual. In contrast, the measurements are highly correlated in the second example, which is the more typical scenario. These distinct scenarios allow us to get a better insight into the workings of the proposed methodology.

Example 1 (Protein ratios): In this case, we are interested in SSD for a study to compare two software packages, XPRESS (Han et al., 2001) and ASAPRatio (Li et al., 2003), for computing abundance ratios of proteins in a pair of blood samples. The former is labor intensive and requires much user input, whereas the latter is automated to a large extent. These packages are used in proteomics for comparing protein profiles under two different conditions to discover proteins that may be differentially expressed. See, e.g., Liebler (2002) for a gentle introduction to proteomics. The response of specific interest to us is the protein abundance ratio in blood samples of a pair of healthy individuals. These ratios are expected to be near one since both samples come from healthy people. We have ratios of 8 proteins in blood samples of 5 pairs of individuals, computed by the two packages. Only a total of 60 ratios are available, as not all proteins are observed in every sample.
These data were collected in the Goodman lab (Department of Molecular and Cell Biology) at our institution to study the variation in the measurement technique.
Example 2 (Blood pressure): In this case, we consider the blood pressure data of Carrasco and Jover (2003). Here systolic blood pressure (in mmHg) is measured twice with each of two methods, a manual mercury sphygmomanometer and an automatic device (OMRON 711), on a sample of 384 individuals. The manual device serves as the reference method and the automatic device is the test method.

2 SSD for method comparison studies

Let y_{ijk}, k = 1, ..., n, j = 1, 2, i = 1, ..., m, represent the k-th replicate measurement from the j-th method on the i-th individual. Here m is the number of individuals and n is the common number of replicate measurements from each method on every individual. A common number of replicates is assumed to simplify the SSD problem. Replicate measurements refer to multiple measurements taken under identical conditions. In particular, the underlying true measurement does not change during replication. Any SSD procedure depends on the model assumed for the data, the inference of interest, and a measure of precision of inference. So we first describe them briefly.

2.1 Modelling the data

We assume that the data follow a hierarchical model

y_{ijk} | β_j, b_i, b_{ij}, σ_j² ~ independent N(β_j + b_i + b_{ij}, σ_j²),
b_i | ψ² ~ independent N(0, ψ²),
b_{ij} | ψ_I² ~ independent N(0, ψ_I²),   (1)

where β_j is the effect of the j-th method, b_i is the effect of the i-th individual, and b_{ij} is the individual x method interaction. Oftentimes, there is no need for the interaction term b_{ij} in (1). Moreover, when n = 1, this term is not identifiable and we drop it from the model.
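To make the data-generating mechanism concrete, model (1) is straightforward to simulate from. The following sketch is in Python (the paper's own computations use R and WinBUGS), and all parameter values shown are hypothetical:

```python
import numpy as np

def simulate_model1(m, n, beta, psi2, psi2_I, sigma2, seed=None):
    """Simulate y[i, j, k] from the hierarchical model (1):
    y_ijk = beta_j + b_i + b_ij + e_ijk, where b_i ~ N(0, psi2),
    b_ij ~ N(0, psi2_I), and e_ijk ~ N(0, sigma2[j])."""
    rng = np.random.default_rng(seed)
    b = rng.normal(0.0, np.sqrt(psi2), size=m)             # individual effects
    b_int = rng.normal(0.0, np.sqrt(psi2_I), size=(m, 2))  # individual x method
    y = np.empty((m, 2, n))
    for j in range(2):
        e = rng.normal(0.0, np.sqrt(sigma2[j]), size=(m, n))
        y[:, j, :] = beta[j] + b[:, None] + b_int[:, j][:, None] + e
    return y

# hypothetical parameter values for a small pilot-sized data set
y = simulate_model1(m=5, n=2, beta=[1.0, 1.1], psi2=0.04, psi2_I=0.01,
                    sigma2=[0.25, 0.30], seed=1)
```

The returned array has one row per individual, one column per method, and one slice per replicate, mirroring the index structure of (1).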
To complete the Bayesian specification of this model, we assume the following mutually independent prior distributions:

β_j ~ N(0, V_j²), ψ² ~ IG(A_ψ, B_ψ), ψ_I² ~ IG(A_I, B_I), σ_j² ~ IG(A_j, B_j), j = 1, 2;   (2)

where IG(A, B) represents an inverse gamma distribution: its reciprocal follows a gamma distribution with mean A/B and variance A/B². All the hyperparameters are positive and need to be specified. Choosing large values for V_j² and small values for the A's and B's typically leads to noninformative priors. For β_j, an alternative noninformative choice is an improper uniform distribution over the real line. It can also be thought of as taking V_j² = ∞ in (2). The resulting posterior is proper for fixed hyperparameter values. These prior choices are quite standard in the Bayesian literature (Gelman et al., 2003, ch 15). See also Browne and Draper (2006) and Gelman (2006) for some recent suggestions for priors on variance parameters. The joint posterior of the parameters under (2) is not available in a closed form. So we use the Gibbs sampler, a Markov chain Monte Carlo (MCMC) algorithm described in the Appendix, to simulate draws from the posterior distribution.

2.2 Evaluation of agreement

In what follows, we use γ as a generic notation for the vector of all relevant parameters without identifying them explicitly; they will be clear from the context. Let the random vectors (y_1, y_2) and (y_{j1}, y_{j2}) respectively denote the bivariate populations of measurements from the two methods on the same individual, and two replicate measurements from the j-th
method on the same individual. The model (1) postulates that

(y_1, y_2)' | γ ~ N_2( (β_1, β_2)', [ ψ² + ψ_I² + σ_1²,  ψ² ;  ψ²,  ψ² + ψ_I² + σ_2² ] ),

(y_{j1}, y_{j2})' | γ ~ N_2( (β_j, β_j)', [ ψ² + ψ_I² + σ_j²,  ψ² + ψ_I² ;  ψ² + ψ_I²,  ψ² + ψ_I² + σ_j² ] ).   (3)

The data in (1) need not be paired for the first joint distribution to be defined. Next, let d_12 = y_1 − y_2 and d_jj = y_{j1} − y_{j2} respectively denote the populations of inter-method differences and intra-method differences for the j-th method. It follows that

d_12 | γ ~ N(μ_12 = β_1 − β_2, τ_12² = 2ψ_I² + σ_1² + σ_2²),  d_jj | γ ~ N(0, τ_jj² = 2σ_j²).   (4)

Let θ_12 be a measure of agreement between the two methods and θ_jj be the corresponding measure of intra-method agreement for the j-th method. The former is a function of the parameters of the distribution of (y_1, y_2), whereas the latter is the same function of the parameters of the distribution of (y_{j1}, y_{j2}). We assume that they are scalar and non-negative, possibly after a simple monotonic transformation, and that their low values indicate good agreement. All the agreement measures mentioned in the introduction satisfy these criteria except the limits of agreement, which quantify agreement using a lower limit and an upper limit. Let U_12 and U_jj be upper credible bounds for θ_12 and θ_jj, respectively, with (1 − α) posterior probability. The agreement is considered adequate when these bounds are small. To compute them, we first fit the model (1) to the observed data with priors (2) and obtain a large number of draws from the joint posterior of the model parameters. These draws are then used to simulate from the posteriors of θ_12 and θ_jj. The (1 − α)-th sample quantiles of the draws of θ_12 and θ_jj are taken as the bounds U_12 and U_jj, respectively.
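As a sketch of this last step, suppose posterior draws of (μ_12, τ_12²) from (4) are already in hand, and take the total deviation index as the agreement measure; the bound U_12 is then just a sample quantile of the induced draws of θ_12. This is an illustrative Python fragment with simulated "posterior" draws standing in for actual MCMC output:

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def tdi(mu, tau2, p0=0.90):
    """Total deviation index: the p0-th quantile of |d|, d ~ N(mu, tau2),
    obtained by solving Pr(|d| <= t) = p0 for t."""
    tau = np.sqrt(tau2)
    f = lambda t: norm.cdf((t - mu) / tau) - norm.cdf((-t - mu) / tau) - p0
    return brentq(f, 0.0, abs(mu) + 10.0 * tau)

def upper_credible_bound(theta_draws, alpha=0.05):
    """U: the (1 - alpha)-th sample quantile of posterior draws of theta."""
    return np.quantile(theta_draws, 1.0 - alpha)

# toy stand-in for thinned MCMC draws of (mu12, tau12^2)
rng = np.random.default_rng(2)
mu_draws = rng.normal(0.1, 0.02, 500)
tau2_draws = rng.uniform(0.8, 1.2, 500)
theta_draws = [tdi(mu, t2, p0=0.90) for mu, t2 in zip(mu_draws, tau2_draws)]
U12 = upper_credible_bound(theta_draws, alpha=0.05)
```

The same recipe applies to θ_jj, with (0, τ_jj²) from (4) in place of (μ_12, τ_12²).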
2.3 A measure of precision of inference

Consider for now only the evaluation of inter-method agreement with the credible bound U_12 for θ_12. It satisfies Pr(θ_12 ≤ U_12 | y) = 1 − α, where y represents the observed data. Although this condition ensures {θ_12 ≤ U_12} with a high posterior probability, it says nothing about how the posterior density of θ_12 is distributed near U_12. In particular, if a large portion of this density is concentrated in a region far below U_12, then the inference is not very precise. Various suggestions have been made in the literature for measuring the precision of a credible interval (see Wang and Gelfand, 2002 for a summary). However, they are relevant only for a two-sided interval, not a one-sided bound such as U_12. So below we develop a new measure of precision that is appropriate for upper bounds. Similar arguments can be used to develop a precision measure for lower bounds. Under certain regularity conditions, the posterior distribution of a population parameter converges to a point mass as m → ∞ (Gelman et al., 2003, ch 4). This suggests that one can focus on the posterior probability Pr(δU_12 ≤ θ_12 ≤ U_12 | y), for a specified δ ∈ (0, 1), in the search for a measure of precision. Further, considering that SSD takes place prior to data collection, our proposal is to take

E{ Pr(δU_12 ≤ θ_12 ≤ U_12 | y) } = 1 − α − E{ F_{12|y}(δU_12) },

or equivalently

T_12(m, n, δ) = E{ F_{12|y}(δU_12) },   (5)

as the measure of precision of U_12. The expectation here is taken over the marginal distribution of y, and F_{12|y} is the posterior cumulative distribution function (cdf) of θ_12. A small value of T_12 indicates a high precision of inference since it represents the probability mass below δU_12. Also, for a fixed (m, n), T_12 is an increasing function of δ, increasing from zero when δ = 0 to (1 − α) when δ = 1. For the bound U_jj, one can similarly take

T_jj(m, n, δ) = E{ F_{jj|y}(δU_jj) },  j = 1, 2,   (6)
as the measure of precision. We have T_11 = T_22 when identical prior distributions are used for the parameters of the two methods. For T_12 and T_jj to be useful in SSD, they should be decreasing functions of m and n, and should tend to zero as m → ∞, keeping everything else fixed. Analytically verifying the first property is hard, but simulation can be used to clarify the approximate monotonicity. The second property follows from a straightforward application of the asymptotic properties of a posterior distribution.

2.4 Computing the precision measure

The expectations T_12 and T_jj are not available in closed form under model (1). So we use the simulation-based approach of Wang and Gelfand (2002) to compute them. Prior to their work, Bayesian SSD focused mostly on normal and binomial one- and two-sample problems, partly because of the lack of a general approach for computing precision measures under hierarchical models, where they are typically not available in closed form. Their key idea is to distinguish between a fitting prior and a sampling prior. A prior that is used to fit a model is what they call a fitting prior. Frequently, this prior is noninformative and sometimes even improper, provided the resulting posterior is proper. On the other hand, for SSD in practice, we are usually interested in achieving a particular level of precision when γ is concentrated in a relatively small portion of the parameter space. This uncertainty in γ is captured using what they call a sampling prior. This prior is necessarily proper and informative. One choice is a uniform distribution over a bounded interval, but other distributions can also be used. In our case with model (1), the fitting prior is given by (2). Examples of sampling priors appear in Section 3. Let γ* denote a draw of γ from its sampling prior and y* be a draw of y from [y | γ*], where
[·] denotes a probability distribution. This y* alone is a draw from [y*], the marginal distribution of y under the sampling prior. By repeating this process we can obtain a large number of draws from [y*], say, y*_l, l = 1, ..., L. Following Wang and Gelfand (2002), we compute the expectations T_12 in (5) and T_jj in (6) with respect to [y*] and approximate them using Monte Carlo integration as

T*_12(m, n, δ) = E{ F_{12|y*}(δU*_12) } ≈ (1/L) Σ_{l=1}^{L} F_{12|y*_l}(δU*_12),
T*_jj(m, n, δ) = E{ F_{jj|y*}(δU*_jj) } ≈ (1/L) Σ_{l=1}^{L} F_{jj|y*_l}(δU*_jj),  j = 1, 2,   (7)

where U*_12 and U*_jj are the counterparts of U_12 and U_jj when y* is used in place of y. Notice here that the role of the sampling prior for γ is limited to generating y* for averaging in (7). The posterior cdfs in these expressions are computed using the fitting prior for γ. Since they too do not have simple forms, we use the posterior draws obtained via MCMC to approximate them. For a given y*_l, F_{12|y*_l}(δU*_12) is approximated by the proportion of posterior draws of θ_12 | y*_l that are less than or equal to δU*_12, where U*_12 is the (1 − α)-th sample quantile of all the draws. A similar strategy is used to approximate F_{jj|y*_l}(δU*_jj). As in the case of T_12 and T_jj, the approximate monotonicity of T*_12 and T*_jj with respect to m and n can be clarified using simulation. We investigate this in Section 3.

2.5 The SSD procedure

We would like to find sample sizes (m, n) that make T*_12 and T*_jj, defined in (7), sufficiently small. There may be several combinations of (m, n) that give the same precision of inference. So the cost of sampling must be taken into account to find the optimal combination. Let C_I and C_R respectively represent the fixed costs associated with sampling an individual and taking a replicate from the two methods. Thus the total cost of sampling n replicates from
both methods on each of m individuals is

C_T(m, n) = m(C_I + n C_R),  C_I, C_R > 0.   (8)

We now propose the following SSD strategy: Specify the sampling priors, a large δ ∈ (0, 1) and a small (1 − β) ∈ (0, 1 − α). Find (m, n) such that

max{ T*_12(m, n, δ), T*_11(m, n, δ), T*_22(m, n, δ) } ≤ 1 − β,   (9)

and C_T(m, n) is minimized. The condition in (9) ensures that the posterior probabilities of the intervals [δU*_12, U*_12] and [δU*_jj, U*_jj], j = 1, 2, are all at least (β − α), providing sufficiently precise inference. In practice, one may take δ ≈ 0.75 and a small value of (1 − β).

3 Simulation study

We now use simulation to verify the monotonicity of the averages in (7) for four agreement measures: concordance correlation (Lin, 1989), coverage probability (Lin et al., 2002), mean squared deviation (Lin, 2000), and total deviation index (Lin, 2000; Choudhary and Nagaraja, 2007). They are defined as follows for the evaluation of inter-method agreement.

Concordance correlation: 2 cov(y_1, y_2) / [ (E(y_1) − E(y_2))² + var(y_1) + var(y_2) ].
Coverage probability: Pr(|y_1 − y_2| ≤ q_0) for a specified small q_0.
Mean squared deviation: E((y_1 − y_2)²).
Total deviation index: the p_0-th quantile of |y_1 − y_2| for a specified large p_0.

The first one is based on the joint distribution of (y_1, y_2), whereas the others are based on the distribution of y_1 − y_2. For the evaluation of intra-method agreement of the j-th method, these measures are defined simply by substituting (y_{j1}, y_{j2}) in place of (y_1, y_2). Table 1 gives
their expressions assuming the distributions (3) and (4). The first two measures range over (0, 1) and their large values indicate good agreement. For the purpose of SSD, we transform them as θ_12 = −log φ_12 and θ_jj = −log φ_jj, where the φ's are defined in Table 1, to satisfy the assumptions on the θ's that they lie in (0, ∞) and their small values indicate good agreement. On the other hand, the last two measures range over (0, ∞) and their small values indicate good agreement. No transformation is needed in this case, i.e., θ_12 = φ_12 and θ_jj = φ_jj. It may be noted that the usual range of the concordance correlation is (−1, 1), but here it is restricted to be positive because cov(y_1, y_2) and cov(y_{j1}, y_{j2}) under model (1) are positive. Moreover, the intra-method versions of the last three measures depend on the model parameters only through σ_j². For the simulation study, we focus on the two examples introduced in Section 1. First, we need to choose sampling priors for the parameters in model (1). We now describe how we construct them by fitting appropriate models to the pilot data in the two examples. In the first example, we take the ASAPRatio software as method 1 and the XPRESS software as method 2. After a preliminary analysis, we model the available data as

y_{ijr} | β_j, b_i, protein_r, σ_j² ~ independent N(β_j + b_i + protein_r, σ_j²),
b_i ~ independent N(0, ψ²);  i = 1, ..., m = 5, j = 1, 2, r = 1, ..., 8;

where y_{ijr} represents the ratio of the r-th protein from the j-th method on the i-th blood sample pair. The interaction term b_{ij} is not included here as the data do not have enough information to estimate it. To fit this model, we assume mutually independent (fitting) priors: uniform(−10³, 10³) for the β's and protein effects, and IG(10⁻³, 10⁻³) for the variance parameters. The model is fitted using the WinBUGS package of Spiegelhalter et al. (2003) by calling it from R through the R2WinBUGS package of Sturtz, Ligges and Gelman (2005).
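Under the distributions (3) and (4), the inter-method versions of the four measures reduce to closed forms in the model parameters. The following Python sketch mirrors those expressions (parameter values are hypothetical; the function name is ours, not the paper's):

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import brentq

def inter_method_measures(beta, psi2, psi2_I, sigma2, q0=np.log(1.5), p0=0.80):
    """Inter-method agreement measures under model (1), using
    d12 ~ N(mu12, tau12^2), mu12 = beta1 - beta2,
    tau12^2 = 2*psi2_I + sigma1^2 + sigma2^2, as in (4)."""
    mu = beta[0] - beta[1]
    tau2 = 2.0 * psi2_I + sigma2[0] + sigma2[1]
    tau = np.sqrt(tau2)
    ccc = 2.0 * psi2 / (mu**2 + 2.0 * psi2 + tau2)               # concordance corr.
    cp = norm.cdf((q0 - mu) / tau) - norm.cdf((-q0 - mu) / tau)  # coverage prob.
    msd = mu**2 + tau2                                           # mean sq. deviation
    # total deviation index: solve Pr(|d12| <= t) = p0 for t
    tdi = brentq(lambda t: norm.cdf((t - mu) / tau)
                 - norm.cdf((-t - mu) / tau) - p0, 0.0, abs(mu) + 10.0 * tau)
    return {"ccc": ccc, "cp": cp, "msd": msd, "tdi": tdi}

measures = inter_method_measures(beta=[0.0, 0.0], psi2=1.0, psi2_I=0.0,
                                 sigma2=[1.0, 1.0])
```

The intra-method versions follow by replacing (μ_12, τ_12²) with (0, 2σ_j²) from (4).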
Table 2 presents posterior summaries for (β_1, β_2, log σ_1², log σ_2², log ψ²). Further, the 95% central credible intervals
for corr(y_1, y_2), corr(y_11, y_12) and corr(y_21, y_22), irrespective of the protein, are (0.01, 0.44), (0.01, 0.43) and (0.01, 0.51), respectively. These correlations are computed using (3). They are relatively small for a method comparison study because the between-individual variation ψ² here is small in comparison with the within-individual variations σ_j², j = 1, 2. Under this model, the agreement measures listed in Table 1 do not depend on the proteins. So for the purpose of SSD for a future study, we ignore their effects and assume model (1). Further, to examine the effect of sampling priors on SSD, we consider two sets of sampling priors for (β_1, β_2, log σ_1², log σ_2², log ψ²): independent normal distributions with means and variances equal to their posterior means and variances, and independent uniform distributions over their central 95% credible intervals. Moreover, since ψ_I² could not be estimated in this small pilot study but might be present in a bigger study, we take the sampling prior for log ψ_I² to be the same as that of log ψ². In the second example, we take the manual device as method 1 and the automatic device as method 2. These data are modelled as (1) with (m, n) = (384, 2) but without the interaction term b_{ij}. This term is dropped from the model on the basis of an exploratory analysis of the data and the deviance information criterion (Spiegelhalter et al., 2003). This model is fitted on the lines of the previous model with the same (fitting) priors for the β's and variances. Posterior summaries for (β_1, β_2, log σ_1², log σ_2², log ψ²) are given in Table 2. The 95% central credible intervals for corr(y_1, y_2), corr(y_11, y_12) and corr(y_21, y_22) are (0.86, 0.90), (0.86, 0.90) and (0.85, 0.90), respectively. Such high correlations are typical in method comparison studies.
In this case also, we consider two sampling priors for (β_1, β_2, log σ_1², log σ_2², log ψ²); they are constructed from the posterior summaries in exactly the same way as in the previous example, except that the b_{ij} term is not included in (1). This is because the present study, which itself is quite large, does not seem to suggest its need.
Our investigation of the properties of the averages in (7) focuses on the following settings: (α, δ) = (0.05, 0.80); fitting priors for the β_j's in (1) as independent (improper) uniform distributions on the real line, and independent IG(10⁻³, 10⁻³) distributions for the variances; m ∈ {15, 20, ..., 150}; and n ∈ {1, 2, 3, 4}. We restrict attention to n ≤ 4 as it is rare to find studies in the literature with more than four replicates. We perform the following steps for each combination of example, sampling prior, agreement measure and the settings above:

(i) Draw γ* from the sampling prior. Then draw y* from [y | γ*] under model (1).

(ii) Fit model (1) to the data y* with the above fitting priors, and simulate draws from the posterior of γ. The Gibbs sampler described in the Appendix is used for posterior simulation. The Markov chain is run for 2,500 iterations and the first 500 iterations are discarded as burn-in. These numbers were determined through a convergence analysis of the simulated chain.

(iii) Use the posterior draws to compute the credible bounds U*_12 and U*_jj, j = 1, 2, for the agreement measure as described in Section 2.4. Also, approximate F_{12|y*}(δU*_12) and F_{jj|y*}(δU*_jj), j = 1, 2, as described in that section.

(iv) Repeat the above steps 2,000 times and compute the averages T*_12 and T*_jj, j = 1, 2, in (7).

It took about 45 minutes on a Linux computer with 4 GB RAM to compute one set of the three averages. The values of T*_11 and T*_12 in the case of normal sampling priors are plotted in Figure 1 for the concordance correlation (both examples), in Figure 2 for the total deviation index with p_0 = 0.8 (both examples), and in Figure 3 for the mean squared deviation and coverage probability with q_0 = log(1.5) (example 1 only). The omitted scenarios have qualitatively similar results.
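Steps (i)-(iv) can be sketched generically. In this outline the three callables are hypothetical stand-ins for the sampling prior, the data simulator of model (1), and the MCMC posterior sampler (the Gibbs sampler of the Appendix in the paper's actual computations):

```python
import numpy as np

def precision_average(draw_gamma, draw_data, posterior_theta,
                      L=2000, alpha=0.05, delta=0.8, seed=None):
    """Monte Carlo approximation (7) of T*(m, n, delta) = E{F_{.|y*}(delta U*)}.
    draw_gamma(rng) draws gamma* from the sampling prior; draw_data(gamma*, rng)
    draws y* from [y | gamma*]; posterior_theta(y*) returns posterior draws of
    the agreement measure theta under the fitting prior."""
    rng = np.random.default_rng(seed)
    mass = np.empty(L)
    for l in range(L):
        gamma_star = draw_gamma(rng)
        y_star = draw_data(gamma_star, rng)
        theta = np.asarray(posterior_theta(y_star))
        U = np.quantile(theta, 1.0 - alpha)    # (1 - alpha)-th sample quantile
        mass[l] = np.mean(theta <= delta * U)  # F_{.|y*_l}(delta * U*)
    return mass.mean()
```

For SSD, this average would be computed over the grid of (m, n) and compared with 1 − β as in (9).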
The graphs confirm that the averages T*_11 and T*_12 can be considered decreasing functions of m and n. This is also true for T*_22, for which the results are not shown. Some of the curves do show minor departures from monotonicity, particularly in the case of n = 1 when the curves drop rather slowly with m. But these are due to the Monte Carlo error involved in averaging and can be ignored. A slow decline also means that a precise evaluation of agreement is difficult even with a large m. For all agreement measures except the concordance correlation, we have max{T*_12, T*_11, T*_22} = max{T*_11, T*_22} for all (m, n), in both examples. It implies that to achieve the same precision of inference, as measured by (δ, β), these measures require a larger sample size for intra-method evaluation than for inter-method evaluation. This property also holds for the concordance correlation in example 2, but it holds in example 1 only when n = 1. Figures 1-3 also demonstrate that different sample sizes are needed for different agreement measures to attain the same precision of inference. All these conclusions for normal sampling priors also hold for uniform sampling priors (results not shown). A further comparison of the two priors reveals that they lead to virtually identical results in the case of example 2. Even in the case of example 1, their differences are generally small, about 2-3% at most. This suggests that for SSD, it does not matter much whether normal or uniform sampling priors are used, as they lead to roughly the same results.

4 Illustration

The results in the previous section demonstrate, in particular, that the proposed SSD procedure of Section 2.5 can be used in conjunction with any of the four agreement measures listed in Table 1. In this section, we apply this procedure to determine sample sizes for the two examples introduced in Section 1. For simplicity, we focus only on the total deviation index
as the measure of agreement. We also investigate the impact of the various choices one has to make for SSD based upon this measure. Other agreement measures can be handled similarly. The method for computing the T*'s, and the sampling and fitting priors, remain the same as in the previous section. We also take (α, β, δ) = (0.05, 0.85, 0.8). Our choice of δ = 0.80 is motivated by bioequivalence studies (Berger and Hsu, 1996; Wang, 1999), where a difference of about 20% is considered practically equivalent. Table 3 presents the combinations of (m, n), 1 ≤ n ≤ 4, that individually satisfy T*_12 ≤ 1 − β and T*_jj ≤ 1 − β, j = 1, 2. In other words, we have at least β − α = 0.80 individual probabilities in the intervals [0.8U*_12, U*_12] and [0.8U*_jj, U*_jj], j = 1, 2, where the upper bounds have 0.95 credible probabilities. The results are presented for both normal and uniform sampling priors, and for p_0 ∈ {0.80, 0.85, 0.90, 0.95}. They are computed by performing a grid search, and some of them can be determined from Figure 2. As we expect from the simulation studies, larger sample sizes are needed for intra-method evaluation than for inter-method evaluation. Moreover, substantially larger values of m are needed for intra-method evaluation when n = 1. The values of m with uniform sampling priors tend to be a bit higher than those with normal sampling priors. They differ by at most two, except in the case of intra-method evaluation with n = 1. The sample sizes for inter-method evaluation also appear quite robust to the choice of p_0; sometimes the values of m do increase by one as p_0 increases to the next level, but their maximum difference over the four settings of p_0 is three, and most of the time the differences are two or less. The sample sizes for intra-method evaluation do not depend on p_0, since the term involving p_0 in the total deviation index appears as a multiplicative constant (see Table 1).
It is also worth pointing out that, with the exception of the n = 1 case, the sample sizes for our two very different examples show remarkable similarity. In the case of n = 1, the value of m for intra-method evaluation is substantially higher for the second example than for the first. This is because the sampling priors in the second example induce high correlations between the measurements, whereas they induce only weak correlations in the first example. It follows from the results in Table 3 and (9) that the desired precision of simultaneous evaluation of intra- and inter-method agreement can be achieved by any of the following (m, n) combinations: (121, 1), (58, 2), (34, 3) and (23, 4) for example 1; and (655, 1), (60, 2), (33, 3) and (23, 4) for example 2. These values are based on uniform sampling priors and do not depend on p_0. To find the optimal combination, recall that the cost of sampling is given by (8), where we expect the relative cost (C_R/C_I) ∈ (0, 1]. Under this cost function, a combination (m_2, n_2) is better than (m_1, n_1) provided (m_2 − m_1) < (C_R/C_I)(m_1 n_1 − m_2 n_2). In particular, if (m_1 n_1 − m_2 n_2) > 0 and (m_2 − m_1) < 0, then the (m_2, n_2) combination is better than (m_1, n_1) irrespective of the cost of sampling. A naive application of this criterion to the two examples, without any subject-matter knowledge, suggests that a higher n is better, and hence (23, 4) is the optimal sample size choice in both cases. In the case of example 1, the experimental protocol does allow us to draw enough blood from the individuals to have four replications. So we decide to use (m, n) = (23, 4) as the sample size for a future study. In the case of example 2, however, it is unlikely that four replicate measurements of blood pressure can be obtained from each method without a change in the blood pressure. On the other hand, we do need replications, particularly for evaluating intra-method agreement. So, as a compromise, we take (m, n) = (60, 2) for a future study. There is no change in these sample sizes if normal sampling priors are used instead of uniform sampling priors.
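The cost comparison above amounts to minimizing C_T(m, n) = m(C_I + nC_R) over the qualifying combinations. A small sketch, using the example 1 combinations from Table 3 and a hypothetical cost ratio r = C_R/C_I:

```python
def total_cost(m, n, cost_ratio):
    """C_T(m, n) / C_I = m * (1 + n * r), with r = C_R / C_I, from (8)."""
    return m * (1.0 + n * cost_ratio)

def optimal_combination(combos, cost_ratio):
    """Among (m, n) pairs meeting the precision requirement (9),
    return the one minimizing the sampling cost."""
    return min(combos, key=lambda mn: total_cost(mn[0], mn[1], cost_ratio))

combos = [(121, 1), (58, 2), (34, 3), (23, 4)]  # example 1, Table 3
best = optimal_combination(combos, cost_ratio=0.5)
```

For these particular combinations, (23, 4) minimizes the cost for every r in (0, 1], matching the conclusion in the text; in general, the optimum depends on r.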
Further, to examine the impact of the hyperparameters of the variances in the fitting priors (2), we repeat the determination of the optimal (m, n) combination with 10⁻¹ and 10⁻² as the common hyperparameter value. We find that m changes by at most one, indicating
that the sample sizes are not sensitive to this choice, provided the hyperparameters are small.

5 Discussion

In this article, we described a Bayesian SSD approach for planning a method comparison study. It is conceptually straightforward and can be used with any scalar agreement measure. However, a disadvantage of this approach is that it is simulation-based and hence computationally intensive. Nevertheless, the computations are easy to program in the popular software packages R and WinBUGS. An R program that can be used for SSD is publicly available on the website. This approach involves specifying sampling priors for the parameters. Information regarding them can be obtained from the literature or through a pilot study, as we do here. Simulation results show that one does not need good approximations for these distributions, since the resulting sample sizes are robust to their choice. The sample sizes are also robust to the choice of the variance hyperparameters in the fitting priors if they are small. The proposed approach also involves making rather subjective choices for (δ, β) that control how much precision of inference is desired. But, in a rough sense, their roles here are similar to the roles of the effect size and desired power, which also need to be specified in the usual frequentist SSD approach. In both examples, among the (m, n) combinations that yield the same precision, the combination with higher n happened to be more cost efficient, at least in the range of n investigated. This was true irrespective of the cost of taking a replicate relative to the cost of sampling an individual. This conclusion is not true in general; it is easy to construct examples where a higher n is more cost efficient only when the relative cost of sampling a replicate is small.
References

Adcock, C. J. (1997). Sample size determination: a review. The Statistician 46.
Barnhart, H. X., Haber, M. J. and Lin, L. I. (2007). An overview on assessing agreement with continuous measurement. Journal of Biopharmaceutical Statistics (to appear).
Berger, R. L. and Hsu, J. C. (1996). Bioequivalence trials, intersection-union tests and equivalence confidence sets. Statistical Science 11.
Bland, J. M. and Altman, D. G. (1999). Measuring agreement in method comparison studies. Statistical Methods in Medical Research 8.
Browne, W. J. and Draper, D. (2006). A comparison of Bayesian and likelihood-based methods for fitting multilevel models. Bayesian Analysis 1.
Carrasco, J. L. and Jover, L. (2003). Estimating the generalized concordance correlation coefficient through variance components. Biometrics 59.
Choudhary, P. K. and Nagaraja, H. N. (2007). Tests for assessment of agreement using probability criteria. Journal of Statistical Planning and Inference 137.
Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models. Bayesian Analysis 1.
Han, D. K., Eng, J., Zhou, H. and Aebersold, R. (2001). Quantitative profiling of differentiation-induced microsomal proteins using isotope-coded affinity tags and mass spectrometry. Nature Biotechnology 19.
Lehmann, E. L. (1998). Elements of Large Sample Theory. Springer-Verlag, New York.
Li, X.-J., Zhang, H., Ranish, J. R. and Aebersold, R. (2003). Automated statistical analysis of protein abundance ratios from data generated by stable isotope dilution and tandem mass spectrometry. Analytical Chemistry 75.
Liebler, D. C. (2002). Introduction to Proteomics: Tools for the New Biology. Humana Press, Totowa, NJ.
Lin, L. I. (1989). A concordance correlation coefficient to evaluate reproducibility. Biometrics 45. Corrections: 2000, Biometrics 56.
Lin, L. I. (1992). Assay validation using the concordance correlation coefficient. Biometrics 48.
Lin, L. I. (2000). Total deviation index for measuring individual agreement with applications in laboratory performance and bioequivalence. Statistics in Medicine 19.
Lin, L. I., Hedayat, A. S., Sinha, B. and Yang, M. (2002). Statistical methods in assessing agreement: Models, issues, and tools. Journal of the American Statistical Association 97.
Lin, S. C., Whipple, D. M. and Ho, C. S. (1998). Evaluation of statistical equivalence using limits of agreement and associated sample size calculation. Communications in Statistics, Part A: Theory and Methods 27.
R Development Core Team (2006). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria.
Ruppert, D., Wand, M. P. and Carroll, R. J. (2003). Semiparametric Regression. Cambridge University Press, New York.
Spiegelhalter, D. J., Best, N. G., Carlin, B. P. and van der Linde, A. (2003). Bayesian measures of model complexity and fit (with discussion). Journal of the Royal Statistical Society, Series B 64.
Spiegelhalter, D. J., Thomas, A., Best, N. G. and Lunn, D. (2003). WinBUGS Version 1.4 User Manual.
Sturtz, S., Ligges, U. and Gelman, A. (2005). R2WinBUGS: A package for running WinBUGS from R. Journal of Statistical Software 12.
Wang, F. and Gelfand, A. E. (2002). A simulation-based approach to Bayesian sample size determination for performance under a given model and for separating models. Statistical Science 17.
Wang, W. (1999). On equivalence of two variances of a bivariate normal vector. Journal of Statistical Planning and Inference 81.

Appendix: Gibbs sampler for posterior simulation

The hierarchical model (1) can be written as

y | (β, b, R) ~ N(Xβ + Zb, R),   b | G ~ N(0, G),

where X and Z are appropriately chosen design matrices consisting of zeros and ones; y = (y_1', ..., y_m')' is the data vector; β = (β_1, β_2)'; b = (b_1, ..., b_m, b_11, b_12, ..., b_m1, b_m2)'; R = diag{R_1, ..., R_m} is the conditional covariance matrix of y; and G = diag{ψ² I_m, ψ²_I I_2m} is the covariance matrix of b. Here I_m denotes an m × m identity matrix, and y_i and R_i are respectively a column vector and a covariance matrix of order 2n, given as

y_i = (y_i11, ..., y_i1n, y_i21, ..., y_i2n)',   R_i = diag{σ²_1 I_n, σ²_2 I_n}.

The parameters in this case are (β, b, ψ², ψ²_I, σ²_1, σ²_2). It is well known that this model is conditionally conjugate under the priors (2) (Ruppert, Wand and Carroll, 2003, ch. 16). In particular, the full conditional distribution, i.e., the conditional distribution of a parameter given the remaining parameters and y, of the column vector (β', b')' is

N( {C'R⁻¹C + D}⁻¹ C'R⁻¹ y, {C'R⁻¹C + D}⁻¹ ),

where C = [X, Z], D = diag{B⁻¹, G⁻¹}, and B = diag{V²_1, V²_2}. The matrices involved in this distribution can be computed using a QR decomposition. The full conditionals of the variance parameters are the following independent inverse gamma distributions:

ψ² ~ IG( A_ψ + m/2, B_ψ + Σ_i b²_i / 2 );
ψ²_I ~ IG( A_I + m, B_I + Σ_i Σ_j b²_ij / 2 );
σ²_j ~ IG( A_j + mn/2, B_j + Σ_i Σ_k (y_ijk - β_j - b_i - b_ij)² / 2 ),   j = 1, 2.

The Gibbs sampler algorithm simulates a Markov chain whose limiting distribution is the desired joint posterior of the parameters (Gelman et al., 2003, ch. 11). It iterates the following two steps until convergence: first, use the current draw of (ψ², ψ²_I, σ²_1, σ²_2) to sample from the normal full conditional of (β, b); then, use the draw of (β, b) from the previous step to sample from the independent inverse gamma full conditionals of (ψ², ψ²_I, σ²_1, σ²_2). The estimates of the variance parameters from a frequentist mixed model fit can be used as starting values.
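As a concrete illustration of the two-step iteration above, the following sketch implements the sampler for model (1) in Python with numpy (the paper itself uses R and WinBUGS). The function interface, the use of a direct matrix inverse in place of the QR decomposition mentioned above, and the common inverse gamma hyperparameter pair (A, B) for all variance components are assumptions made for the example.

```python
import numpy as np

def gibbs_sampler(y, m, n, n_iter=2000, seed=0, V2=(1e2, 1e2), A=1e-2, B=1e-2):
    """Two-step Gibbs sampler for model (1): y[i, j, k] = beta_j + b_i + b_ij + e_ijk.

    y has shape (m, 2, n). V2 holds the prior variances (V_1^2, V_2^2) of
    (beta_1, beta_2), and a single pair (A, B) serves as the common inverse
    gamma hyperparameter value for all variance components; these defaults
    are assumed values for the illustration, not the paper's settings."""
    rng = np.random.default_rng(seed)
    N, p = 2 * m * n, 2 + 3 * m                 # data length; no. of (beta, b) terms
    C = np.zeros((N, p))                        # C = [X, Z], zeros and ones
    yv = np.zeros(N)
    meth = np.zeros(N, dtype=int)               # method index of each row
    r = 0
    for i in range(m):
        for j in range(2):
            for k in range(n):
                C[r, j] = 1.0                   # fixed effect beta_j
                C[r, 2 + i] = 1.0               # individual effect b_i
                C[r, 2 + m + 2 * i + j] = 1.0   # individual-method effect b_ij
                yv[r], meth[r] = y[i, j, k], j
                r += 1
    psi2, psi2I, sig2 = 1.0, 1.0, np.array([1.0, 1.0])   # generic starting values
    draws = []
    for _ in range(n_iter):
        # Step 1: (beta, b) | variances, y  ~  multivariate normal
        Rinv = 1.0 / sig2[meth]                 # diagonal of R^{-1}
        D = np.concatenate([1.0 / np.asarray(V2),
                            np.full(m, 1.0 / psi2),
                            np.full(2 * m, 1.0 / psi2I)])
        cov = np.linalg.inv(C.T @ (C * Rinv[:, None]) + np.diag(D))
        theta = rng.multivariate_normal(cov @ (C.T @ (Rinv * yv)), cov)
        b_i, b_ij = theta[2:2 + m], theta[2 + m:]
        # Step 2: variances | (beta, b), y  ~  independent inverse gammas,
        # drawn as reciprocals of gamma variates with the matching rates
        psi2 = 1.0 / rng.gamma(A + m / 2, 1.0 / (B + b_i @ b_i / 2))
        psi2I = 1.0 / rng.gamma(A + m, 1.0 / (B + b_ij @ b_ij / 2))
        resid = yv - C @ theta
        for j in range(2):
            rss = np.sum(resid[meth == j] ** 2)
            sig2[j] = 1.0 / rng.gamma(A + m * n / 2, 1.0 / (B + rss / 2))
        draws.append((theta[:2].copy(), psi2, psi2I, sig2.copy()))
    return draws
```

A frequentist mixed model fit would supply better starting values, as noted above; generic values suffice for illustration.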
List of Tables and Figures

Table 1: Expressions for various agreement measures for inter-method evaluation and intra-method evaluation of the j-th method, j = 1, 2. They are derived using (3) and (4) under model (1). Here Φ(·) represents the N(0, 1) cdf, and χ²_1(p_0, Δ) represents the p_0-th quantile of a chi-squared distribution with one degree of freedom and non-centrality parameter Δ.

Table 2: Posterior summaries for selected parameters in the protein ratios data (example 1) and the blood pressure data (example 2).

Table 3: Values of (m, n) that give the desired precision of agreement evaluation. Here 1-1, 2-2 and 1-2 respectively indicate intra-method agreement of method 1, intra-method agreement of method 2, and inter-method agreement. Results are presented assuming the total deviation index is used as the measure of agreement with various common choices for p_0. The sample sizes for intra-method evaluation do not depend on p_0.

Figure 1: Average probabilities T*_11 and T*_12 when agreement is measured using concordance correlation. These probabilities measure precisions of inter-method agreement and intra-method agreement of method 1, respectively. Here n represents the number of replicates, and results are presented for both examples.

Figure 2: Average probabilities T*_11 and T*_12 when agreement is measured using total deviation index with p_0 = . Here n represents the number of replicates, and results are presented for both examples.

Figure 3: Average probabilities T*_11 and T*_12 when agreement is measured using mean squared deviation (top) and coverage probability (bottom) with q_0 = log(1.5). Here n represents the number of replicates, and results are presented only for example 1.
measure                   inter-method (φ_12)                              intra-method (φ_jj)
concordance correlation   2ψ² / (μ²_12 + 2ψ² + 2ψ²_I + σ²_1 + σ²_2)        (ψ² + ψ²_I) / (ψ² + ψ²_I + σ²_j)
coverage probability      Φ((q_0 - μ_12)/τ_12) - Φ((-q_0 - μ_12)/τ_12)     Φ(q_0/τ_jj) - Φ(-q_0/τ_jj)
mean squared deviation    μ²_12 + τ²_12                                    τ²_jj
total deviation index     τ_12 {χ²_1(p_0, μ²_12/τ²_12)}^{1/2}              τ_jj {χ²_1(p_0, 0)}^{1/2}

Table 1: Expressions for various agreement measures for inter-method evaluation and intra-method evaluation of the j-th method, j = 1, 2. They are derived using (3) and (4) under model (1). Here Φ(·) represents the N(0, 1) cdf, and χ²_1(p_0, Δ) represents the p_0-th quantile of a chi-squared distribution with one degree of freedom and non-centrality parameter Δ.
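The Table 1 expressions are straightforward to evaluate once the model parameters are in hand. The sketch below (Python with scipy, in place of the paper's R) is an illustration: the function names are hypothetical, and mu and tau are taken directly as inputs, assumed to denote the mean and standard deviation of the relevant difference (μ_12 and τ_12 for the inter-method case, 0 and τ_jj for the intra-method case) as defined via (3) and (4), which precede this excerpt.

```python
from math import sqrt
from scipy.stats import chi2, ncx2, norm

def ccc_inter(mu12, psi2, psi2I, sig2_1, sig2_2):
    """Inter-method concordance correlation from the Table 1 expression."""
    return 2.0 * psi2 / (mu12 ** 2 + 2.0 * psi2 + 2.0 * psi2I + sig2_1 + sig2_2)

def ccc_intra(psi2, psi2I, sig2_j):
    """Intra-method concordance correlation for method j."""
    return (psi2 + psi2I) / (psi2 + psi2I + sig2_j)

def coverage_prob(mu, tau, q0):
    """P(|D| < q0) for a difference D ~ N(mu, tau^2); take mu = 0 and
    tau = tau_jj for the intra-method version."""
    return norm.cdf((q0 - mu) / tau) - norm.cdf((-q0 - mu) / tau)

def msd(mu, tau):
    """Mean squared deviation of the difference."""
    return mu ** 2 + tau ** 2

def tdi(mu, tau, p0):
    """Total deviation index: tau times the square root of the p0-th quantile
    of a chi-squared distribution with 1 df and non-centrality (mu/tau)^2."""
    nc = (mu / tau) ** 2
    q = chi2.ppf(p0, df=1) if nc == 0 else ncx2.ppf(p0, df=1, nc=nc)
    return tau * sqrt(q)
```

With mu = 0 the total deviation index reduces to tau times the (1 + p_0)/2 standard normal quantile, which gives a quick sanity check on the non-central chi-squared computation.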
Table 2: Posterior summaries (mean, sd and selected percentiles) for the parameters β_1, β_2, log σ_1, log σ_2 and log ψ in the protein ratios data (example 1) and the blood pressure data (example 2).
Table 3: Values of (m, n) that give the desired precision of agreement evaluation under normal and uniform sampling priors. Here 1-1, 2-2 and 1-2 respectively indicate intra-method agreement of method 1, intra-method agreement of method 2, and inter-method agreement. Results are presented assuming the total deviation index is used as the measure of agreement with various common choices for p_0. The sample sizes for intra-method evaluation do not depend on p_0.
Figure 1: Average probabilities T*_11 and T*_12, plotted against the number of individuals m with separate curves for n = 1, 2, 3, 4 replicates, when agreement is measured using concordance correlation. These probabilities measure precisions of inter-method agreement and intra-method agreement of method 1, respectively. Results are presented for both examples.
Figure 2: Average probabilities T*_11 and T*_12, plotted against the number of individuals m with separate curves for n = 1, 2, 3, 4 replicates, when agreement is measured using total deviation index with p_0 = . Results are presented for both examples.
Figure 3: Average probabilities T*_11 and T*_12, plotted against the number of individuals m with separate curves for n = 1, 2, 3, 4 replicates, when agreement is measured using mean squared deviation (top) and coverage probability (bottom) with q_0 = log(1.5). Results are presented only for example 1.
More informationTechnical Vignette 5: Understanding intrinsic Gaussian Markov random field spatial models, including intrinsic conditional autoregressive models
Technical Vignette 5: Understanding intrinsic Gaussian Markov random field spatial models, including intrinsic conditional autoregressive models Christopher Paciorek, Department of Statistics, University
More informationBAYESIAN MODEL CRITICISM
Monte via Chib s BAYESIAN MODEL CRITICM Hedibert Freitas Lopes The University of Chicago Booth School of Business 5807 South Woodlawn Avenue, Chicago, IL 60637 http://faculty.chicagobooth.edu/hedibert.lopes
More informationDIC: Deviance Information Criterion
(((( Welcome Page Latest News DIC: Deviance Information Criterion Contact us/bugs list WinBUGS New WinBUGS examples FAQs DIC GeoBUGS DIC (Deviance Information Criterion) is a Bayesian method for model
More informationCHAPTER 10. Bayesian methods. Geir Storvik
CHAPTER 10 Bayesian methods Geir Storvik 10.1 Introduction Statistical inference concerns about learning from data, either parameters (estimation) or some, typically future, variables (prediction). In
More informationBayesian Model Diagnostics and Checking
Earvin Balderama Quantitative Ecology Lab Department of Forestry and Environmental Resources North Carolina State University April 12, 2013 1 / 34 Introduction MCMCMC 2 / 34 Introduction MCMCMC Steps in
More informationA note on Reversible Jump Markov Chain Monte Carlo
A note on Reversible Jump Markov Chain Monte Carlo Hedibert Freitas Lopes Graduate School of Business The University of Chicago 5807 South Woodlawn Avenue Chicago, Illinois 60637 February, 1st 2006 1 Introduction
More informationWhy Bayesian approaches? The average height of a rare plant
Why Bayesian approaches? The average height of a rare plant Estimation and comparison of averages is an important step in many ecological analyses and demographic models. In this demonstration you will
More informationA Note on Bayesian Inference After Multiple Imputation
A Note on Bayesian Inference After Multiple Imputation Xiang Zhou and Jerome P. Reiter Abstract This article is aimed at practitioners who plan to use Bayesian inference on multiplyimputed datasets in
More informationBayesian Estimation for the Generalized Logistic Distribution Type-II Censored Accelerated Life Testing
Int. J. Contemp. Math. Sciences, Vol. 8, 2013, no. 20, 969-986 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/10.12988/ijcms.2013.39111 Bayesian Estimation for the Generalized Logistic Distribution Type-II
More informationA noninformative Bayesian approach to domain estimation
A noninformative Bayesian approach to domain estimation Glen Meeden School of Statistics University of Minnesota Minneapolis, MN 55455 glen@stat.umn.edu August 2002 Revised July 2003 To appear in Journal
More informationContents. Part I: Fundamentals of Bayesian Inference 1
Contents Preface xiii Part I: Fundamentals of Bayesian Inference 1 1 Probability and inference 3 1.1 The three steps of Bayesian data analysis 3 1.2 General notation for statistical inference 4 1.3 Bayesian
More informationPrinciples of Bayesian Inference
Principles of Bayesian Inference Sudipto Banerjee 1 and Andrew O. Finley 2 1 Biostatistics, School of Public Health, University of Minnesota, Minneapolis, Minnesota, U.S.A. 2 Department of Forestry & Department
More informationPARAMETER ESTIMATION: BAYESIAN APPROACH. These notes summarize the lectures on Bayesian parameter estimation.
PARAMETER ESTIMATION: BAYESIAN APPROACH. These notes summarize the lectures on Bayesian parameter estimation.. Beta Distribution We ll start by learning about the Beta distribution, since we end up using
More informationAppendix: Modeling Approach
AFFECTIVE PRIMACY IN INTRAORGANIZATIONAL TASK NETWORKS Appendix: Modeling Approach There is now a significant and developing literature on Bayesian methods in social network analysis. See, for instance,
More informationHierarchical models. Dr. Jarad Niemi. August 31, Iowa State University. Jarad Niemi (Iowa State) Hierarchical models August 31, / 31
Hierarchical models Dr. Jarad Niemi Iowa State University August 31, 2017 Jarad Niemi (Iowa State) Hierarchical models August 31, 2017 1 / 31 Normal hierarchical model Let Y ig N(θ g, σ 2 ) for i = 1,...,
More informationBiol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference Fall 2016
Biol 206/306 Advanced Biostatistics Lab 12 Bayesian Inference Fall 2016 By Philip J. Bergmann 0. Laboratory Objectives 1. Learn what Bayes Theorem and Bayesian Inference are 2. Reinforce the properties
More informationBayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features. Yangxin Huang
Bayesian Inference on Joint Mixture Models for Survival-Longitudinal Data with Multiple Features Yangxin Huang Department of Epidemiology and Biostatistics, COPH, USF, Tampa, FL yhuang@health.usf.edu January
More informationStat 5101 Lecture Notes
Stat 5101 Lecture Notes Charles J. Geyer Copyright 1998, 1999, 2000, 2001 by Charles J. Geyer May 7, 2001 ii Stat 5101 (Geyer) Course Notes Contents 1 Random Variables and Change of Variables 1 1.1 Random
More informationA Heteroscedastic Measurement Error Model for Method Comparison Data With Replicate Measurements
A Heteroscedastic Measurement Error Model for Method Comparison Data With Replicate Measurements Lakshika S. Nawarathna Department of Statistics and Computer Science University of Peradeniya Peradeniya
More information