OBJECTIVE PRIORS FOR THE BIVARIATE NORMAL MODEL. BY JAMES O. BERGER AND DONGCHU SUN. Duke University and University of Missouri-Columbia
The Annals of Statistics, 2008, Vol. 36, No. 2. DOI: 10.1214/07-AOS501. © Institute of Mathematical Statistics, 2008.

OBJECTIVE PRIORS FOR THE BIVARIATE NORMAL MODEL
BY JAMES O. BERGER AND DONGCHU SUN
Duke University and University of Missouri-Columbia

Study of the bivariate normal distribution raises the full range of issues involving objective Bayesian inference, including the different types of objective priors (e.g., Jeffreys, invariant, reference, matching), the different modes of inference (e.g., Bayesian, frequentist, fiducial) and the criteria involved in deciding on optimal objective priors (e.g., ease of computation, frequentist performance, marginalization paradoxes). Summary recommendations as to optimal objective priors are made for a variety of inferences involving the bivariate normal distribution. In the course of the investigation, a variety of surprising results were found, including the availability of objective priors that yield exact frequentist inferences for many functions of the bivariate normal parameters, including the correlation coefficient.

1. Introduction and prior distributions.

1.1. Notation and problem statement. The bivariate normal distribution of (x1, x2)' has mean parameters μ = (μ1, μ2)' and covariance matrix

(1)  Σ = ( σ1², ρσ1σ2 ; ρσ1σ2, σ2² ),

where ρ is the correlation between x1 and x2. The density is

[1/(2πσ1σ2√(1−ρ²))] exp{ −[ (x1−μ1)²/σ1² + (x2−μ2)²/σ2² − 2ρ(x1−μ1)(x2−μ2)/(σ1σ2) ] / [2(1−ρ²)] }.

The data consist of an independent random sample X = (x_k)' = (x_{1k}, x_{2k}), k = 1, ..., n, of size n ≥ 3, for which the sufficient statistics are

x̄ = (x̄1, x̄2)'  and  S = Σ_{k=1}^n (x_k − x̄)(x_k − x̄)' = ( s11, r√(s11 s22) ; r√(s11 s22), s22 ).

1 Supported in part by NSF Grant DMS.
2 Supported in part by NSF Grant SES and NIH Grant R0-MH0748.
AMS 2000 subject classifications. Primary 62F10, 62F15, 62F25; secondary 62A01, 62E15, 62H10, 62H20.
Key words and phrases. Reference priors, matching priors, Jeffreys priors, right-Haar prior, fiducial inference, frequentist coverage, marginalization paradox, rejection sampling, constructive posterior distributions.
where, for i, j = 1, 2,

x̄_i = (1/n) Σ_{k=1}^n x_{ik},  s_{ij} = Σ_{k=1}^n (x_{ik} − x̄_i)(x_{jk} − x̄_j)  and  r = s12/√(s11 s22).

We will denote prior densities as π(μ1, μ2, σ1, σ2, ρ), and the corresponding posterior densities as π(μ1, μ2, σ1, σ2, ρ | X) (all with respect to dμ1 dμ2 dσ1 dσ2 dρ).

We consider objective inference for parameters of the bivariate normal distribution and functions of these parameters, with special focus on the development of objective confidence or credible sets. Section 1.2 introduces many of the key issues to be covered, through a summary of some of the most interesting results involving priors yielding exact frequentist procedures; this section also raises interesting historical and philosophical issues. For easy access, Section 1.3 presents our summary recommendations as to which priors to utilize. Often, the posteriors for the recommended priors are essentially available in computational closed form, allowing direct Monte Carlo simulation. Section 2 provides simple accept-reject schemes for computing with the recommended priors in other cases. Sections 3 and 4 develop the needed theory, concerning what are called reference priors and matching priors, respectively, and also present various simulations that were conducted to enable summary recommendations to be made.

Notation: In addition to (μ1, μ2, σ1, σ2, ρ), the following parameters will be considered:

(2)  η1 = 1/σ1²,  η2 = 1/[σ2²(1−ρ²)],  η3 = −ρ/[σ1√(1−ρ²)],
(3)  θ1 = ρσ2/σ1,  θ2 = σ2²(1−ρ²),
(4)  θ3 = |Σ| = σ1²σ2²(1−ρ²),  θ4 = σ2²(1−ρ²)/σ1²,
(5)  θ5 = μ1/σ1,  θ6 = σ1σ2,  θ7 = σ1/σ2,  θ8 = μ2/σ2,
(6)  θ9 = σ12 = ρσ1σ2,  θ10 = σ1² + σ2² − 2ρσ1σ2,
(7)  θ11 = d'Σd  [d = (d1, d2)' not proportional to (0, 1)'],  λ1 = ch_max(Σ),  λ2 = ch_min(Σ).

Some of these parameters have straightforward statistical interpretations. Since (x2 | x1, μ, Σ) ~ N(μ2 + θ1(x1 − μ1), θ2), it is clear that θ1 is a regression coefficient, θ2 is a conditional variance, and η2 is the corresponding precision. For the marginal distribution of x1, η1 is the precision and θ5 is the reciprocal of the
coefficient of variation. θ3 is usually called the generalized variance. (η1, η2, η3) gives a type of Cholesky decomposition of the precision matrix [see (13) in Section 2.2]. θ10 is the variance of x1 − x2, and θ11 is the variance of d1 x1 + d2 x2. Finally, λ1 and λ2 are the largest and smallest eigenvalues of Σ.

Technical issue. We will assume that |ρ| < 1 and |r| < 1 in virtually all expressions and results that follow. This is because, if either equals 1 in absolute value, then ρ = {sign of r} with probability 1 (either frequentist or Bayesian posterior, as relevant). Indeed, the situation then essentially collapses to the univariate version of the problem, which is standard.

1.2. Matching, constructive posteriors and fiducial distributions. The bivariate normal distribution has been extensively studied from frequentist, fiducial and objective Bayesian perspectives. Table 1 summarizes a number of interesting results. For a variety of parameters, it presents objective priors (discussed below) for which the resulting Bayesian posterior credible sets of level α are also exact frequentist confidence sets at the same level; in this case, the priors are said to be exact frequentist matching. This is a very desirable situation: see [23] and [2] for general discussion and the many earlier references. For μ1, μ2, σ1², σ2² and ρ, the constructive posterior distributions are also the fiducial distributions for the parameters, as found in Fisher [14, 15] and [1].

Posterior distributions are presented as constructive random distributions, that is, by a description of how to simulate from them. Thus, to simulate from the posterior distribution of σ1² given the data (actually, only s11 is needed), one draws independent χ²_{n−1} random variables and simply computes the corresponding s11/χ²_{n−1}; this yields an independent sample from the fiducial/posterior distribution of σ1². Table 1 also lists the objective prior distributions that yield the indicated objective posterior.
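The constructive simulation just described is easy to express in code. The following is a minimal sketch (the function name and inputs are illustrative, not from the paper), assuming the s11/χ²_{n−1} form given above:

```python
import numpy as np

def sigma1_sq_posterior_draws(s11, n, size, seed=0):
    """Constructive (fiducial) posterior draws of sigma_1^2:
    each draw is s11 / chi2_{n-1}, per the construction in the text."""
    rng = np.random.default_rng(seed)
    chi2 = rng.chisquare(n - 1, size=size)  # independent chi^2_{n-1} draws
    return s11 / chi2
```

Each returned value is then an independent draw from the fiducial/posterior distribution of σ1².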
The notation π_{ab} in the table stands for the important class of prior densities (a subclass of the generalized Wishart distributions of [8])

(8)  π_{ab}(μ1, μ2, σ1, σ2, ρ) = 1/[ σ1^{3−a} σ2^{2−b} (1−ρ²)^{(4−b)/2} ].

Special cases of this class are the Jeffreys-rule prior π_J = π_{10}, the right-Haar prior π_H = π_{12}, the independence Jeffreys prior π_IJ = π_{21} = 1/[σ1σ2(1−ρ²)^{3/2}], and π_RO, which has a = b = 1. The independence Jeffreys prior follows from using a constant prior for the means, and then the Jeffreys prior for the covariance matrix with means given.

We highlight the results about ρ in Table 1 because they are interesting from practical, historical and philosophical perspectives. First, it does not seem to be
TABLE 1
Parameters with exact matching priors of the form π_{ab}, and associated constructive posteriors. Here Z is a standard normal random variable, and χ²_{n−1} and χ²_{n−2} are chi-squared random variables with the indicated degrees of freedom, all random variables being independent. For μ1, μ2, σ1², σ2² and ρ, the indicated posteriors are also fiducial distributions.

Parameter | Prior | Posterior
μ1 | π_{1b}, any b (including π_J and π_H) | x̄1 + (Z/√(χ²_{n−1})) √(s11/n)
μ2 | π_J = π_{10} | x̄2 + (Z/√(χ²_{n−1})) √(s22/n)
d'(μ1, μ2)', d ∈ R² | π_J = π_{10} and π~_H (see Table 4) | d'(x̄1, x̄2)' + (Z/√(χ²_{n−1})) √(d'Sd/n)
σ1² | π_{1b}, any b (including π_J and π_H) | s11/χ²_{n−1}
ρ | π_H = π_{12} | ψ( [Z + √(χ²_{n−2}) r/√(1−r²)] / √(χ²_{n−1}) ),  ψ(y) = y/√(1+y²)
η3 = −ρ/[σ1√(1−ρ²)] | π_{a2}, any a (including π_H) | [Z − √(χ²_{n−2}) r/√(1−r²)] / √s11
θ1 = ρσ2/σ1 | π_{a2}, any a (including π_H) | r√(s22/s11) − (Z/√(χ²_{n−2})) √(s22(1−r²)/s11)
θ2 = σ2²(1−ρ²) | π_{a2}, any a (including π_H) | s22(1−r²)/χ²_{n−2}
θ3 = |Σ| | π_H = π_{12} and π_IJ = π_{21} | |S|/(χ²_{n−1} χ²_{n−2})
θ4 = σ2²(1−ρ²)/σ1² | π_H = π_{12} | [s22(1−r²)/s11] (χ²_{n−1}/χ²_{n−2})
θ5 = μ1/σ1 | π_{1b}, any b (including π_J and π_H) | Z/√n + x̄1 √(χ²_{n−1}/s11)
θ11 = d'Σd | π_J = π_{10} and π~_H (see Table 4) | d'Sd/χ²_{n−1}

known that the indicated prior for ρ is exact frequentist matching (proved here in Theorem 2). Indeed, standard statistical software utilizes various approximations to arrive at frequentist confidence sets for ρ, missing the fact that a simple exact confidence set exists, even for n = 3. It was, of course, known that exact frequentist confidence procedures could be constructed (cf. Exercise 54, Chapter 6 of [18]), but explicit expressions do not seem to be available.

The historically interesting aspect of this posterior for ρ is that it is also the fiducial distribution of ρ. Geisser and Cornfield [16] studied the question of whether the fiducial distribution of ρ could be reproduced as an objective Bayesian posterior, and they concluded that this was most likely not possible. The strongest evidence for this arose from Brillinger [7], which used results from [19] and a difficult analytic argument to show that there does not exist a prior π(ρ) such that
the fiducial density of ρ equals f(r | ρ)π(ρ), where f(r | ρ) is the density of r given ρ. Since the fiducial distribution of ρ only depends on r, it was certainly reasonable to speculate that, if it were not possible to derive this distribution from the density of r and a prior, then it would not be possible to do so in general. The above result, of course, shows that this speculation was incorrect.

The philosophically interesting aspect of this situation is that Brillinger's result does show that the fiducial/posterior distribution for ρ provides another example of the marginalization paradox ([13]). This leads to an interesting philosophical conundrum of a type that we have not previously seen: a complete fiducial/objective Bayesian/frequentist unification can be obtained for inference about ρ, but only if violation of the marginalization paradox is accepted. We will shortly introduce a prior distribution that avoids the marginalization paradox for ρ, but which is not exactly frequentist matching. We know of no way to adjudicate between the competing goals of exact frequentist matching and avoidance of the marginalization paradox, and so will simply present both as possible objective Bayesian approaches. (Note that the same conundrum also arises for θ5 = μ1/σ1; the exact frequentist matching prior results in a marginalization paradox, as shown in [24].) Some interesting examples of improper priors resulting in marginalization paradoxes can be found in Ghosh and Yang [17] and Datta and Ghosh [10, 11].

1.3. Recommended priors. It is actually rare to have exact matching priors for parameters of interest. Also, one is often interested in very complex functions of the parameters (e.g., predictive distributions) and/or joint distributions of parameters. For such problems it is important to have a general objective prior that seems to perform reasonably well for all quantities of interest.
Furthermore, it is unappealing to many Bayesians to change the prior according to which parameter is declared to be of interest, and an objective prior that performs well overall is often sought. The five priors we recommend for various purposes are π_J, π_H,

(9)  π_Rρ ∝ 1/[σ1σ2(1−ρ²)],  π_Rσ2 ∝ (1+ρ²)/[σ1σ2(1−ρ²)]

and

(10)  π_Rλ ∝ 1/[ σ1σ2(1−ρ²) √((σ1/σ2 − σ2/σ1)² + 4ρ²) ].

The first prior in (9) was developed in [20] and was studied extensively in [1], where it was shown to be a one-at-a-time reference prior (see Section 3). The second prior in (9) is new and is derived in Section 3. π_Rλ was developed as a one-at-a-time reference prior in [25].

With these definitions, we can make our summary recommendations. Table 2 gives the four objective priors that are recommended for use, and indicates for
6 968 J. O. BERGER AND D. SUN TABLE Recommendations of objective priors for various parameters in the bivariate normal model: indicates that the posterior will not be exact frequentist matching.(for μ and parameters with σ replaced by σ, use the right-haar prior with the variances interchanged. Prior Parameter σ π Rρ ρ, σ, general use π H μ, σ, ρ, η 3, ρσ σ, σ ( ρ,, σ σ ρ, μ σ π H (see Table 4 d (μ,μ, d d π Rλ ch max ( π Rσ σ = ρσ σ which parameters (or functions thereof they are recommended. These recommendations are based on three criteria: (i the degree of frequentist matching, discussed in Section 4; (ii being a one-at-a-time reference prior, discussed in Section 3;and (iii ease of computation. The rationale for each of the entries in the table, based on these criteria, is given in Section 4.5. Another commonly used prior is the scale prior, π S (σ σ. The motivation that is often given for this prior is that it is standard to use σi as the prior for a standard deviation σ i, while <ρ< is on a bounded set and so one can use a constant prior in ρ. We do not recommend this prior, but do consider its performance in Section Computation. In this paper, a constant prior is always used for (μ,μ, so that (( μ, (( x ( X N μ,n. x Generation from this conditional posterior distribution is standard, so the challenge of simulation from the posterior distribution requires only sampling from (σ,σ,ρ X. The marginal likelihood of (σ,σ,ρsatisfies ( L (σ,σ,ρ (n / exp ( trace(s. It is immediate that, under the priors π J and π IJ, the marginal posteriors of are Inverse Wishart (S,nand Inverse Wishart (S,n, respectively. Berger, Strawderman and Tang [4] gave a Metropolis Hastings algorithm to generate from (σ,σ,ρ X based on the prior π Rλ. The following sections deal with the other priors we consider.
TABLE 3
Ratio π/π_IJ, upper bound M and rejection step when π = π_Rρ, π_Rσ1, π_Rσ2 and π_S

Prior | Ratio π/π_IJ | Bound M | Rejection step
π_Rρ | (1−ρ²)^{1/2} | 1 | u ≤ (1−ρ²)^{1/2}
π_Rσ1 | (1−ρ⁴)^{1/2} | 1 | u ≤ (1−ρ⁴)^{1/2}
π_Rσ2 | (1+ρ²)(1−ρ²)^{1/2} | 4√6/9 | u ≤ [9/(4√6)] (1+ρ²)(1−ρ²)^{1/2}
π_S | (1−ρ²)^{3/2} | 1 | u ≤ (1−ρ²)^{3/2}

2.1. Marginal posteriors of (σ1, σ2, ρ) under π_Rρ, π_Rσ1, π_Rσ2 and π_S. For these priors, an independent sample from π(σ1, σ2, ρ | X) can be obtained by the following acceptance-rejection algorithm:

Simulation step. Generate (σ1, σ2, ρ) from the independence Jeffreys posterior π_IJ(σ1, σ2, ρ | X) [the Inverse Wishart(S, n−1) distribution] and, independently, sample u ~ Uniform(0, 1).

Rejection step. Suppose M ≡ sup_{(σ1,σ2,ρ)} π(σ1, σ2, ρ)/π_IJ(σ1, σ2, ρ) < ∞. If u ≤ π(σ1, σ2, ρ)/[M π_IJ(σ1, σ2, ρ)], accept (σ1, σ2, ρ); else, return to the Simulation step.

For each of the priors listed in Table 3, the key ratio π/π_IJ is listed in the table, along with the upper bound M and the resulting Rejection step. The rejection algorithm is quite efficient for sampling these posteriors. Indeed, for ρ near 0, the algorithms accept with probability near one and, even for large |ρ|, the acceptance probabilities are very reasonable for the priors π_Rρ, π_Rσ1 and π_Rσ2. For large |ρ|, the algorithm is less efficient for the posterior under the prior π_S, but even these acceptance rates may well be fine in practice, given the simplicity of the algorithm.

2.2. Computation under π_{ab}. The most interesting prior of this form (besides the Jeffreys and independence Jeffreys priors) is the right-Haar prior π_H, although other priors such as π_{11} arise as reference priors, and hence are potentially of interest. While Table 1 gave an explicit form for the most important marginal posteriors arising from priors of this form, it is of considerable interest that essentially closed form generation from the full posterior of any prior of this form is possible (see, e.g., [8]).
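The Rejection step of Section 2.1 is simple to implement once draws from the π_IJ posterior are available. The following minimal sketch (names illustrative) applies it for π_Rρ, assuming the ratio π_Rρ/π_IJ ∝ √(1−ρ²) with bound M = 1, as given in Table 3; only the ρ component of each (σ1, σ2, ρ) draw enters the ratio:

```python
import numpy as np

def reject_step_pi_rrho(rho_draws, seed=1):
    """Accept-reject thinning of pi_IJ posterior draws toward the pi_Rrho
    posterior: a draw with correlation rho is accepted when
    u <= sqrt(1 - rho^2), with u ~ Uniform(0, 1)."""
    rng = np.random.default_rng(seed)
    rho_draws = np.asarray(rho_draws)
    u = rng.uniform(size=rho_draws.shape)
    return rho_draws[u <= np.sqrt(1.0 - rho_draws**2)]
```

Draws with ρ near 0 are almost always accepted, while draws with |ρ| near 1 are thinned heavily, matching the efficiency discussion above.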
This is briefly reviewed in this section, since the expressions for the resulting constructive posteriors are needed for later results on frequentist coverage. It is most convenient to work with the parameters (η1, η2, η3) given in (2). This parameterization gives a type of Cholesky decomposition of the precision
8 970 J. O. BERGER AND D. SUN matrix, ( ( η η = 3 η 0 (3, 0 η η 3 η which accounts for the simplicity of ensuing computations. Note that ( is equivalent to σ = η, σ = + η 3 η 3 (4, ρ =. η η η η + η 3 The prior π ab of (8 for(μ,μ,σ,σ,ρ transforms to the extended conjugate class of priors for (μ,μ,η,η,η 3,givenbyπ ab (μ,μ,η,η,η 3 = η a η b. LEMMA. Consider the prior π ab. (a The marginal posterior of η 3 given (η,η ; X is N( η r s /s, /s. (b The marginal posterior distributions of η and η are independent and (η X Gamma( (n a, s ; (η X Gamma( (n b, s ( r. See [5] for a proof of this result. We next present the constructive posteriors of (η,η,η 3, and from these derive the constructive posteriors of (μ,μ,σ,σ,ρ and other parameters. All results follow directly from Lemma and (4. In presenting the constructive posteriors, we will use a star to represent a random draw from the implied distribution; thus μ will represent a random draw from its posterior distribution, Z,Z,Z 3 will be independent draws from the standard normal distribution, and and χ n b will be independent draws from chi-squared distributions with the indicated degrees of freedom. Recall that these constructive posteriors are not only useful for simulation, but will be the key to proving exact frequentist matching results. FACT. (a The constructive posterior of (η,η,η 3 given X can be expressed as (5 η =, η s = η 3 = Z 3 s χn b s χ n b s ( r, r r.
(b) The constructive posterior of (σ1², σ2², ρ) given X can be expressed as

(16)  σ1*² = s11/χ²*_{n−a},  σ2*² = [s22(1−r²)/χ²*_{n−b}] (1 + Y*²),
(17)  ρ* = ψ(Y*),
(18)  Y* = [ −Z3* + √(χ²*_{n−b}) r/√(1−r²) ] / √(χ²*_{n−a}),  where ψ(y) = y/√(1+y²).

(c) The constructive posteriors for μ1 and μ2 can be written

(19)  μ1* = x̄1 + (Z1*/√(χ²*_{n−a})) √(s11/n),
(20)  μ2* = x̄2 + θ1*(μ1* − x̄1) + (Z2*/√(χ²*_{n−b})) √(s22(1−r²)/n),

where θ1* = r√(s22/s11) − (Z3*/√(χ²*_{n−b})) √(s22(1−r²)/s11).

3. Reference priors. This paper began with an effort to derive and catalogue the possible reference priors for the bivariate normal distribution. Reference prior theory (cf. Bernardo [6] and Berger and Bernardo [3]) has arguably been the most successful technique for deriving objective priors. Reference priors depend on (i) specification of a parameter of interest; (ii) specification of nuisance parameters; (iii) specification of a grouping of parameters; and (iv) ordering of the groupings. These are all conveyed by the shorthand notation used in Table 4. Thus, {(μ1, μ2), (σ1, σ2, ρ)} indicates that (μ1, μ2) is the parameter of interest, with the others being nuisance parameters, and there are two groupings with the indicated ordering. (The resulting reference prior is the independence Jeffreys prior, π_IJ.) As another example, {λ1, λ2, ϑ, μ1, μ2} introduces the eigenvalues λ1 > λ2 of Σ as being primarily of interest, with ϑ (the angle defining the orthogonal matrix that diagonalizes Σ), μ1 and μ2 being the nuisance parameters.

Based on experience with numerous examples, the reference priors that are typically judged to be best are one-at-a-time reference priors, in which each parameter is listed separately as its own group. Hence we will focus on these priors. It turns out to be the case that, for the one-at-a-time reference priors, the ordering of μ1 and μ2 among the variables is irrelevant. Hence, if μ1 and μ2 are omitted from a listing in Table 4, the resulting reference prior is to be viewed as any one-at-a-time reference prior with the indicated ordering of the other variables, with the μ_i being inserted anywhere in the ordering.
10 97 J. O. BERGER AND D. SUN TABLE 4 Reference priors for the bivariate normal model (where μ = d (μ,μ, ( σ = θ 7, ρ = d (0, /(σ θ7, θ = σ [ ( ρ ] and θ = ρσ / σ ; {{ }} indicates that any ordering of the parameters yields the same reference prior Prior π(μ,μ,σ,σ,ρ For parameter ordering Has form (8with π J σ σ ( ρ {(μ,μ,σ,σ,ρ} (a, b = (, 0 π IJ σ σ ( ρ 3/ {(μ,μ, (σ,σ,ρ} (a, b = (, π Rρ σ σ ( ρ {ρ,σ,σ }, {θ 7,θ 6,ρ} π Rσ +ρ σ σ ( ρ {σ,σ,ρ} π Rσ {σ σ σ ( ρ ρ,ρ,σ } {σ,η 3,θ } π RO σ σ ( ρ 3/ {σ,θ,η 3 } (a, b = (, π Rλ [((σ /σ (σ /σ +4ρ ] / σ σ ( ρ π H σ ( ρ π H d μ dμ d σ dσ d ρ ( σ [ ( ρ ] {λ,λ,ϑ} {{σ,θ,θ }}, {{θ,θ 3,θ 4 }} (a, b = (, {{η,η,θ }}, {{η,θ,θ }} {{d (μ,μ,μ,θ, θ, θ }} We are interested in finding one-at-a-time reference priors for the parameters μ,μ,σ,σ,ρ, η 3, θ,...,θ 9 and λ. This is done in [5], with the results summarized in Table 4, for all these parameters (i.e., the parameter appears as the first entry in the parameter ordering except η 3, σ,andμ i /σ i ; finding one-at-a-time reference priors for these parameters is technically challenging. (We do not explicitly list the reference priors for σ in the table, since they can be found by simply switching with σ in the various expressions. 4. Comparisons of priors via frequentist matching. 4.. Frequentist coverage probabilities and exact matching. Suppose a posterior distribution is used to create one-sided credible intervals (θ L,θ α (X, where θ L is the lower limit in the relevant parameter space and θ α (X is the posterior quantile of the parameter θ of interest, defined by P(θ < θ α (X X = α. (Here θ is the random variable. Of interest is the frequentist coverage of the corresponding confidence interval, that is, C(μ,μ,σ,σ,ρ = P(θ <θ α (X μ,μ,σ,σ,ρ.(herex is the random variable. The closer C(μ,μ,σ,σ,ρ is to the nominal α, the better the procedure (and corresponding objective prior is judged to be. The main results about exact matching are given in Theorems through 8. 
The proofs of Theorems 1, 2 and 8 are given in Section 5; the rest can be found in [5].
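The coverage C(μ1, μ2, σ1, σ2, ρ) can also be estimated by direct Monte Carlo. The sketch below (names and settings illustrative) checks the exact-matching claim for ρ under the right-Haar prior, assuming the constructive posterior of ρ given in Table 1:

```python
import numpy as np

def coverage_rho_right_haar(rho_true, n=10, alpha=0.9,
                            n_rep=400, n_post=2000, seed=3):
    """Monte Carlo estimate of P(rho < rho*_alpha | rho_true): simulate
    data sets, compute the one-sided alpha posterior quantile of rho
    under pi_H, and record how often it exceeds the true rho."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho_true], [rho_true, 1.0]])
    hits = 0
    for _ in range(n_rep):
        x = rng.multivariate_normal(np.zeros(2), cov, size=n)
        r = np.corrcoef(x[:, 0], x[:, 1])[0, 1]
        z = rng.standard_normal(n_post)
        chi_a = rng.chisquare(n - 1, n_post)  # a = 1
        chi_b = rng.chisquare(n - 2, n_post)  # b = 2
        y = (z + np.sqrt(chi_b) * r / np.sqrt(1.0 - r**2)) / np.sqrt(chi_a)
        rho_star = y / np.sqrt(1.0 + y**2)
        hits += rho_true < np.quantile(rho_star, alpha)
    return hits / n_rep
```

Exact matching predicts a value near α for every true ρ; for example, coverage_rho_right_haar(0.5) should be close to 0.9, up to Monte Carlo error.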
The following technical lemmas will be repeatedly utilized. The first lemma is from (3d.2.8) in [22]. Lemma 3 is easy.

LEMMA 2. For n ≥ 3 and given (σ1, σ2, ρ), the following three random variables are independent and have the indicated distributions:

(21)  T1 = [ s11/(σ2²(1−ρ²)) ]^{1/2} [ r√(s22/s11) − ρσ2/σ1 ] ≡ Z3 (standard normal),
(22)  T3 = s22(1−r²)/[σ2²(1−ρ²)] ~ χ²_{n−2},
(23)  T5 = s11/σ1² ~ χ²_{n−1}.

LEMMA 3. Let Y_α denote the α quantile of any random variable Y.
(a) If g(·) is a monotonically increasing function, [g(Y)]_α = g(Y_α) for any α ∈ (0, 1).
(b) If W is a positive random variable, (WY)_α ≥ 0 if and only if Y_α ≥ 0.

We will reserve quantile notation for posterior quantiles, that is, quantiles with respect to the starred distributions. Thus a quantile such as [ (−Z3* + √(χ²*_{n−b}) r/√(1−r²)) / √(χ²*_{n−a}) ]_α would be computed based on the joint distribution of (Z3*, χ²*_{n−a}, χ²*_{n−b}), while holding (σ1, σ2, ρ, r, s11, s22, Z3, χ²_{n−1}, χ²_{n−2}) fixed.

4.2. Credible intervals for a class of functions of (σ1, σ2, ρ). We consider the one-sided credible intervals of σ1, σ2 and ρ and some functions of the form

(24)  θ = σ1^{d1} σ2^{d2} g(ρ),  for d1, d2 ∈ R and some function g(·).

We also consider a class of scale-invariant priors

(25)  π(μ1, μ2, σ1, σ2, ρ) ∝ h(ρ)/(σ1^{c1} σ2^{c2}),  for some c1, c2 ∈ R and a positive function h.

THEOREM 1. Denote the α posterior quantile of θ by θ_α(X) under the prior (25). For any fixed (μ1, μ2, σ1, σ2, ρ), the frequentist coverage of the credible interval (θ_L, θ_α(X)) depends only on ρ. Here θ_L is the lower boundary of the parameter space for θ.

Note that the parameters ρ, η1, η2, η3, θ1, ..., θ4 are all functions of the form (24). From Theorem 1, under any of the priors π_J, π_IJ, π_Rσ2, π_Rρ, π_RO, π_H, π_S, the
frequentist coverage probabilities of credible intervals for any of these parameters will depend only on ρ. We will show that the frequentist coverage probabilities can be exact under the prior π_{ab}. Since η1 (η2) is a monotone function of σ1² (θ2), we consider only ρ and the last 5 parameters.

4.3. Coverage probabilities under π_{ab}.

THEOREM 2. (a) For ψ defined in (18), the posterior α quantile of ρ is ρ*_α = ψ(Y*_α).
(b) For any α ∈ (0, 1), ξ = (μ1, μ2, σ1, σ2) and ρ ∈ (−1, 1),

(26)  P(ρ < ρ*_α | ξ, ρ) = P( ( [ −Z3* + √(χ²*_{n−b}) (√(1−ρ²) Z3 + ρ√(χ²_{n−1})) / (√(1−ρ²)√(χ²_{n−2})) ] / √(χ²*_{n−a}) )_α > ρ/√(1−ρ²) ).

(c) (26) equals α if and only if the right-Haar prior is used, that is, (a, b) = (1, 2).

THEOREM 3. (a) For any α ∈ (0, 1), ξ = (μ1, μ2, σ1, σ2) and ρ ∈ (−1, 1),

(27)  P(η3 < (η3*)_α | ξ, ρ) = P( ( Z3* − √(χ²*_{n−b}) (√(1−ρ²) Z3 + ρ√(χ²_{n−1})) / (√(1−ρ²)√(χ²_{n−2})) )_α > −ρ√(χ²_{n−1})/√(1−ρ²) ).

(b) (27) equals α for any −1 < ρ < 1 if and only if b = 2.

THEOREM 4. (a) The constructive posterior of θ1 = ρσ2/σ1 has the expression

(28)  θ1* = r√(s22/s11) − (Z3*/√(χ²*_{n−b})) √( s22(1−r²)/s11 ).

(b) For any α ∈ (0, 1), ξ = (μ1, μ2, σ1, σ2) and ρ ∈ (−1, 1),

P( θ1 < (θ1*)_α | ξ, ρ ) = P( t_{n−2} < √((n−2)/(n−b)) (t_{n−b})_α ),

which does not depend on ρ. Furthermore, it equals α if and only if b = 2.

THEOREM 5. (a) The constructive posterior of θ2 = σ2²(1−ρ²) is θ2* = s22(1−r²)/χ²*_{n−b}.
(b) For any α ∈ (0, 1), ξ = (μ1, μ2, σ1, σ2) and ρ ∈ (−1, 1),

(29)  P( θ2 < (θ2*)_α | ξ, ρ ) = P( χ²_{n−2} > (χ²_{n−b})_{1−α} ),

which does not depend on ρ. Furthermore, (29) equals α if and only if b = 2.
THEOREM 6. (a) The constructive posterior of θ3 = |Σ| is θ3* = |S|/(χ²*_{n−a} χ²*_{n−b}).
(b) For any ξ = (μ1, μ2, σ1, σ2) and ρ ∈ (−1, 1),

(30)  P( θ3 < (θ3*)_α | ξ, ρ ) = P( χ²_{n−1} χ²_{n−2} > (χ²_{n−a} χ²_{n−b})_{1−α} ),

which does not depend on ρ. Furthermore, (30) equals α iff (a, b) is (1, 2) or (2, 1).

THEOREM 7. (a) The constructive posterior of θ4 = σ2²(1−ρ²)/σ1² is θ4* = [s22(1−r²)/s11] (χ²*_{n−a}/χ²*_{n−b}).
(b) For any ξ = (μ1, μ2, σ1, σ2) and ρ ∈ (−1, 1),

(31)  P( θ4 < (θ4*)_α | ξ, ρ ) = P( χ²_{n−1}/χ²_{n−2} < (χ²_{n−a}/χ²_{n−b})_α ),

which does not depend on ρ. Furthermore, (31) equals α iff (a, b) = (1, 2).

An interesting function of (μ1, μ2, σ1, σ2, ρ) not of the form (24) is θ5 = μ1/σ1.

THEOREM 8. (a) The constructive posterior of θ5 = μ1/σ1 is θ5* = Z1*/√n + x̄1 √(χ²*_{n−a}/s11).
(b) For any α ∈ (0, 1), the frequentist coverage of the credible interval (−∞, (θ5*)_α) is

(32)  P( θ5 < (θ5*)_α | μ1, μ2, σ1, σ2, ρ ) = P( (Z1 − θ5√n)/√(χ²_{n−1}) < ( (Z2 − θ5√n)/√(χ²_{n−a}) )_α ),

which depends only on θ5 and equals α if and only if a = 1.

4.4. First order asymptotic matching. Datta and Mukerjee [9] and Datta and Ghosh [12] discuss how to determine first-order matching priors for functions of parameters; these are priors such that the frequentist coverage of a one-sided credible interval is equal to the Bayesian coverage up to a term of order n^{−1}. For each of the nine objective priors π_J, π_IJ, π_Rρ, π_Rσ1, π_RO, π_Rλ, π_H, π_S and π_Rσ2, [5] determines if it is a first-order matching prior for each of the parameters μ1, μ2, σ1, σ2, ρ, η3, θ1, ..., θ10. The results are listed in Table 5. For example, π_J is a first-order matching prior for μ1, μ2, σ1, σ2, θ1, θ5, θ7, θ8 and θ10, but not for η3, θ2, θ3 and θ9.
TABLE 5
The first-order asymptotic matching of objective priors for μ1, μ2, σ1, σ2, ρ, μ2 − μ1, η3, θ_j, j = 1, ..., 10. Here a starred entry indicates exact matching

Prior | π(μ1, μ2, σ1, σ2, ρ) | Matching: Yes | Matching: No
π_J | 1/[σ1²σ2²(1−ρ²)²] | μ1*, μ2*, σ1*, σ2*, μ2−μ1*, θ1, θ5*, θ7, θ8, θ10 | ρ, η3, θ2, θ3, θ9
π_IJ | 1/[σ1σ2(1−ρ²)^{3/2}] | μ1, μ2 | σ1, σ2, ρ
π_Rρ | 1/[σ1σ2(1−ρ²)] | μ1, μ2, ρ, μ2−μ1, θ1, θ3, θ7 | σ1, σ2, η3, θ2, θ5, θ8, θ9, θ10
π_Rσ1 | (1−ρ⁴)^{1/2}/[σ1σ2(1−ρ²)] | μ1, μ2, μ2−μ1, θ3, θ7 | σ1, σ2, ρ, η3, θ1, θ2, θ5, θ8, θ9, θ10
π_RO | 1/[σ1²σ2(1−ρ²)^{3/2}] | μ1, μ2, μ2−μ1, η3, θ3, θ7 | σ1, σ2, ρ, θ1, θ2, θ5, θ8, θ9, θ10
π_Rλ | 1/[σ1σ2(1−ρ²)√((σ1/σ2 − σ2/σ1)² + 4ρ²)] | μ1, μ2, μ2−μ1, θ3 | σ1, σ2, ρ, η3, θ1, θ2, θ5, θ7, θ8, θ9, θ10
π_H | 1/[σ1²(1−ρ²)] | μ1*, μ2, σ1*, ρ*, μ2−μ1, η3*, θ1*, θ2*, θ3*, θ4*, θ5* | σ2, θ7, θ8, θ9, θ10
π_S | 1/(σ1σ2) | μ1, μ2, μ2−μ1, θ3, θ7 | σ1, σ2, ρ, η3, θ1, θ2, θ5, θ8, θ9, θ10
π_Rσ2 | (1+ρ²)/[σ1σ2(1−ρ²)] | μ1, μ2, μ2−μ1, θ3, θ7, θ9 | σ1, σ2, ρ, θ1, θ2, η3, θ5, θ8, θ10

4.5. Numerically computed coverage and recommendations. First-order matching is only an asymptotic property, and finite sample performance is also crucial. We thus also implemented a modest numerical study, comparing the numerical values of frequentist coverages of the one-sided credible sets P(θ > q_{0.05}) and P(θ < q_{0.95}), for the parameters θ listed in Table 6 and for the eight objective priors π_J, π_IJ, π_Rρ, π_Rσ2, π_RO, π_Rλ, π_H and π_S. As usual, q_α = q_α(X) is the posterior α-quantile of θ, and the coverage probability is computed based on the sampling distribution of q_α(X) for the fixed parameters (μ1, μ2, σ1, σ2) and ρ. Many of the coverage probabilities depend only on ρ, which was thus chosen to be the x-axis in the graphs. We considered the case n = 3 (the minimal possible sample size and hence the most challenging in terms of obtaining good coverage) and the two scenarios Case a: (μ1, μ2, σ1, σ2) = (0, 0, 1, 1), and Case b: (μ1, μ2, σ1, σ2) = (0, 0, 1, 2).
TABLE 6
Performance of objective priors for each of the parameters

Parameter | Bad | Medium | Good
μ1 | | rest | π_RO, π_H, π_J
μ2 − μ1 | | rest | π_J, π_RO
σ1² | π_IJ | rest | π_H, π_Rλ, π_MS
σ2² | π_H, π_RO, π_IJ | rest | π_J
ρ | π_J, π_IJ, π_S, π_RO | | π_Rρ, π_Rσ2, π_Rλ, π_H, π_MS
λ1 | rest | π_J, π_Rλ, π_RO |
θ3 = |Σ| | π_RO, π_J | rest | π_IJ, π_H
θ7 = σ1/σ2 | π_H, π_J, π_RO, π_Rλ | rest |
θ9 = σ12 | π_J, π_IJ (due to size) | rest | π_H, π_Rρ, π_Rσ2

Here we present the numerical results concerning coverage for only two of the parameters: ρ in Figure 1 and θ7 = σ1/σ2 in Figure 2. Table 6 summarizes the results from the entire numerical study, the details of which can be found in [5]. The recommendations made in Table 2 for the boxed parameters are justified from these numerical results as follows.

FIG. 1. Frequentist coverages for ρ, where Case a: (μ1, μ2, σ1, σ2) = (0, 0, 1, 1), and Case b: (μ1, μ2, σ1, σ2) = (0, 0, 1, 2). The x-axis is ρ ∈ (−1, 1).
FIG. 2. Frequentist coverages for θ7 = σ1/σ2, where Case a: (μ1, μ2, σ1, σ2) = (0, 0, 1, 1), and Case b: (μ1, μ2, σ1, σ2) = (0, 0, 1, 2). The x-axis is ρ ∈ (−1, 1).

The inferences involving the nonboxed parameters in Table 2 are given in closed form in Table 1 (and so are computationally simple), and are exact frequentist matching. Furthermore, with the exception of μ1/σ1 and η3, the nonboxed parameters have the indicated priors as one-at-a-time reference priors, so all three criteria point to the indicated recommendation.

For ρ, we recommend using π_Rρ, since this prior is a one-at-a-time reference prior for ρ, first-order matching (as shown in Table 5), and has excellent numerical coverage as shown in Figure 1. Note that some might prefer to use the right-Haar prior because of its exact matching for ρ (even though it exhibits a marginalization paradox).

For σ1/σ2, the one-at-a-time reference prior was also π_Rρ. As this was first-order frequentist matching and among the best in terms of numerical coverage (see Figure 2), we also recommend it for this parameter.

For λ1, the situation is unclear. The one-at-a-time reference prior is π_Rλ and is hence our recommendation, but first-order matching results for this parameter are not known, and the numerical coverages of all priors were rather bad.

For σ12, the only first-order matching prior among our candidates is π_Rσ2. It also had the best numerical coverages, and so is a clear recommendation. Note, however, that we were not able to determine if it is a one-at-a-time reference prior for σ12, so the recommendation should be considered tentative.

The most interesting question is what to recommend for general use, as an all-purpose prior. Looking at Table 2, it might seem that π_H or even π_J would be good choices, since they are optimal for so many parameters. However, both these priors
can also give quite bad coverages, as indicated in Figure 2 for π_H and in Figures 1 and 2 for π_J. Indeed, from Table 6, the only priors that did not have significantly poor performance for at least one parameter (other than λ1, for which no prior gave good coverages) were π_Rρ and π_Rσ2. The numerical coverages for π_Rρ and π_Rσ2 are virtually identical for all the parameters, so there is no principled way to choose between them. π_Rρ is a commonly used prior and somewhat simpler, so it becomes our recommended choice for a general prior.

5. Proofs. Due to space limitations, we give only the proofs of Theorems 1, 2 and 8, because their proofs are quite different. The proofs of the other theorems in Section 4 are relatively easy consequences of Fact 1 and Lemmas 2-3. For details of these other proofs, see [5].

5.1. Proof of Theorem 1. With the constant prior for (μ1, μ2), the marginal likelihood of (σ1, σ2, ρ) depends on S and is proportional to

|Σ|^{−(n−1)/2} exp{ −½ trace(Σ^{−1} S) }.

Define

D = { (σ1', σ2', ρ') : σ1'^{d1} σ2'^{d2} g(ρ') < σ1^{d1} σ2^{d2} g(ρ) },
G(X, σ1, σ2, ρ) = ∫_D π(σ1', σ2', ρ' | S) dσ1' dσ2' dρ'.

Clearly, the frequentist coverage probability is

P{ θ < θ_α(X) | μ1, μ2, σ1, σ2, ρ } = P{ G(S, σ1, σ2, ρ) < α | σ1, σ2, ρ }.

Under the prior (25),

G(X, σ1, σ2, ρ) = ∫_D h(ρ') exp(−0.5 trace(Σ'^{−1}S)) / [σ1'^{n−1+c1} σ2'^{n−1+c2} (1−ρ'²)^{(n−1)/2}] dσ1' dσ2' dρ' ÷ ∫ h(ρ') exp(−0.5 trace(Σ'^{−1}S)) / [σ1'^{n−1+c1} σ2'^{n−1+c2} (1−ρ'²)^{(n−1)/2}] dσ1' dσ2' dρ',

where Σ' is the symmetric matrix whose diagonal elements are σ1'² and σ2'², and whose off-diagonal element is σ1'σ2'ρ'. Denote Δ = diag(1/σ1, 1/σ2) and make the transformations

T = (t_ij) = ΔSΔ  and  Ω = (ω_ij) = ΔΣ'Δ = ( ω1², ρ'ω1ω2 ; ρ'ω1ω2, ω2² ),

with ω1 = σ1'/σ1 and ω2 = σ2'/σ2. Clearly trace(Σ'^{−1}S) = trace(Ω^{−1}T), and then

G(X, σ1, σ2, ρ) = ∫_{D~} h(ρ') exp(−0.5 trace(Ω^{−1}T)) / [ω1^{n−1+c1} ω2^{n−1+c2} (1−ρ'²)^{(n−1)/2}] dω1 dω2 dρ' ÷ ∫ h(ρ') exp(−0.5 trace(Ω^{−1}T)) / [ω1^{n−1+c1} ω2^{n−1+c2} (1−ρ'²)^{(n−1)/2}] dω1 dω2 dρ',
where D~ = { (ω1, ω2, ρ') : ω1^{d1} ω2^{d2} g(ρ') < g(ρ) }. Since the sampling distribution of T depends only on ρ, so does the sampling distribution of G(X, σ1, σ2, ρ). Also, D~ depends on ρ only. The result thus holds.

5.2. Proof of Theorem 2. It follows from (17), (18) and Lemma 3(a) that

P(ρ < ρ*_α | ξ, ρ) = P( ψ( [ (−Z3* + √(χ²*_{n−b}) r/√(1−r²)) / √(χ²*_{n−a}) ]_α ) > ρ | ρ ).

Note that ψ, defined in (18), is invertible, and ψ^{−1}(ρ) = ρ/√(1−ρ²) for |ρ| < 1. It follows from Lemma 3(a) and (b) that

P(ρ < ρ*_α | ξ, ρ) = P( [ (−Z3* + √(χ²*_{n−b}) r/√(1−r²)) / √(χ²*_{n−a}) ]_α > ρ/√(1−ρ²) ).

It follows from (21)-(23) that

r/√(1−r²) = r√(s22/s11) · √s11 / √(s22(1−r²))
          = [ ρσ2/σ1 + Z3 σ2√(1−ρ²)/√s11 ] √s11 / [ σ2√(1−ρ²) √(χ²_{n−2}) ]
          = [ Z3 + ρ√(χ²_{n−1})/√(1−ρ²) ] / √(χ²_{n−2}).

Consequently,

P(ρ < ρ*_α | ξ, ρ) = P( ( [ −Z3* + √(χ²*_{n−b}) (Z3 + ρ√(χ²_{n−1})/√(1−ρ²)) / √(χ²_{n−2}) ] / √(χ²*_{n−a}) )_α > ρ/√(1−ρ²) ).

This completes the proof of part (b). For part (c), if (26) equals α for any −1 < ρ < 1, choose ρ = 0 and get

P( Z3/√(χ²_{n−2}) < ( Z3*/√(χ²*_{n−b}) )_α ) = α,

which implies that b = 2. Substituting b = 2 into (26) shows that a = 1.
5.3. Proof of Theorem 8. Part (a) is obvious. For part (b), since x̄1 = μ1 + Z2σ1/√n, and Z2 and χ²_{n−1} = s11/σ1² are independent, we have

P( θ5 < (θ5*)_α ) = P( ( [ Z1*/√n + (θ5 + Z2/√n) √(χ²*_{n−a}/χ²_{n−1}) ] − θ5 )_α > 0 ).

It follows from Lemma 3(a) and (b) that

P( θ5 < (θ5*)_α ) = P( (−Z2 − θ5√n)/√(χ²_{n−1}) < ( (Z1* − θ5√n)/√(χ²*_{n−a}) )_α ).

Because Z1 and −Z2 have the same distribution, and Z2 and χ²_{n−1} are independent, (32) holds. If (32) equals α for any θ5, choose θ5 = 0:

P( Z1/√(χ²_{n−1}) < ( Z2/√(χ²_{n−a}) )_α ) = α,

which implies that a = 1. The result holds.

Acknowledgments. The authors are grateful to Fei Liu for performing the numerical frequentist coverage computations, to Xiaoyan Lin for computing the matching priors in Table 5, and to Susie Bayarri for helpful discussions. The authors gratefully acknowledge the very constructive comments of the editor, an associate editor and two referees.

REFERENCES

[1] BAYARRI, M. J. (1981). Inferencia bayesiana sobre el coeficiente de correlación de una población normal bivariante. Trabajos de Estadistica e Investigacion Operativa.
[2] BAYARRI, M. J. and BERGER, J. (2004). The interplay between Bayesian and frequentist analysis. Statist. Sci.
[3] BERGER, J. O. and BERNARDO, J. M. (1992). On the development of reference priors (with discussion). In Bayesian Statistics 4. Oxford Univ. Press.
[4] BERGER, J. O., STRAWDERMAN, W. and TANG, D. (2005). Posterior propriety and admissibility of hyperpriors in normal hierarchical models. Ann. Statist.
[5] BERGER, J. O. and SUN, D. (2006). Objective priors for a bivariate normal model with multivariate generalizations. Technical Report 07-06, ISDS, Duke Univ.
[6] BERNARDO, J. M. (1979). Reference posterior distributions for Bayesian inference (with discussion). J. Roy. Statist. Soc. Ser. B.
[7] BRILLINGER, D. R. (1962). Examples bearing on the definition of fiducial probability with a bibliography. Ann. Math. Statist.
[8] BROWN, P., LE, N. and ZIDEK, J. (1994). Inference for a covariance matrix. In Aspects of Uncertainty: A Tribute to D. V. Lindley (P. R. Freeman and A. F. M. Smith, eds.). Wiley, Chichester.
[9] DATTA, G. and MUKERJEE, R. (2004). Probability Matching Priors: Higher Order Asymptotics. Springer, New York.
[10] DATTA, G. S. and GHOSH, J. K. (1995a). On priors providing frequentist validity for Bayesian inference. Biometrika.
[11] DATTA, G. S. and GHOSH, J. K. (1995b). Noninformative priors for maximal invariant parameter in group models. Test.
[12] DATTA, G. S. and GHOSH, M. (1995c). Some remarks on noninformative priors. J. Amer. Statist. Assoc.
[13] DAWID, A. P., STONE, M. and ZIDEK, J. V. (1973). Marginalization paradoxes in Bayesian and structural inference (with discussion). J. Roy. Statist. Soc. Ser. B.
[14] FISHER, R. A. (1930). Inverse probability. Proc. Cambridge Philos. Soc.
[15] FISHER, R. A. (1956). Statistical Methods and Scientific Inference. Oliver and Boyd, Edinburgh.
[16] GEISSER, S. and CORNFIELD, J. (1963). Posterior distributions for multivariate normal parameters. J. Roy. Statist. Soc. Ser. B.
[17] GHOSH, M. and YANG, M.-C. (1996). Noninformative priors for the two sample normal problem. Test.
[18] LEHMANN, E. L. (1986). Testing Statistical Hypotheses, 2nd ed. Wiley, New York.
[19] LINDLEY, D. V. (1961). The use of prior probability distributions in statistical inference and decisions. In Proc. 4th Berkeley Sympos. Math. Statist. Probab. (J. Neyman and E. L. Scott, eds.). Univ. California Press, Berkeley.
[20] LINDLEY, D. V. (1965). Introduction to Probability and Statistics from a Bayesian Viewpoint. Cambridge Univ. Press.
[21] PRATT, J. W. (1963). Shorter confidence intervals for the mean of a normal distribution with known variance. Ann. Math. Statist.
[22] RAO, C. R. (1973). Linear Statistical Inference and Its Applications. Wiley, New York.
[23] SEVERINI, T. A., MUKERJEE, R. and GHOSH, M. (2002). On an exact probability matching property of right-invariant priors. Biometrika.
[24] STONE, M. and DAWID, A. P. (1972). Un-Bayesian implications of improper Bayes inference in routine statistical problems. Biometrika.
[25] YANG, R. and BERGER, J. (1994). Estimation of a covariance matrix using the reference prior. Ann. Statist.

ISDS
DUKE UNIVERSITY
BOX 90251
DURHAM, NORTH CAROLINA
USA
E-MAIL: berger@stat.duke.edu

DEPARTMENT OF STATISTICS
UNIVERSITY OF MISSOURI-COLUMBIA
146 MIDDLEBUSH HALL
COLUMBIA, MISSOURI
USA
E-MAIL: sund@missouri.edu