Objective Bayesian and fiducial inference: some results and comparisons

Piero Veronese and Eugenio Melilli
Bocconi University, Milano, Italy


Abstract

Objective Bayesian analysis and fiducial inference are both attempts to derive probabilities for statements concerning the value of an unknown parameter, so it is natural to inquire into their relationship. While results in this direction exist for univariate-parameter models, where the objective analysis is usually based on the Jeffreys prior, little work has been done for the multivariate case, in which reference priors are typically employed. In this paper we propose a way to construct fiducial distributions in quite general models and show that they have many similarities with objective posteriors. First, fiducial distributions and reference priors are both constructed through a step-by-step conditional procedure which induces an order on the inferential importance of the components of the parameter. Second, in most cases fiducial distributions and reference posteriors coincide, as happens for location-scale models. The proposed procedure becomes simpler when the model belongs to a particular subclass of the natural exponential family, called conditionally reducible, which includes the multinomial and negative-multinomial models. In this class we characterize the models for which the fiducial distribution can be seen as a posterior, and show that the corresponding prior belongs to the enriched conjugate family and coincides with the reference prior. Finally, the asymptotic normality of the fiducial distribution is proved, matching the well-known result holding for regular Bayesian posteriors.

Keywords: conditional reducibility, confidence distribution, Jeffreys prior, location-scale parameter model, multinomial model, natural exponential family, quadratic variance function, reference prior.

1 Introduction

Objective Bayesian analysis, see e.g. Berger (2006) and Berger, Bernardo & Sun (2015), fiducial inference, see e.g. Fisher (1973) and Hannig (2009, 2013), and confidence distribution theory, see e.g. Schweder & Hjort (2002, 2015), Singh et al. (2005) and Xie & Singh (2013), are in some sense all attempts to face the same problem: to construct a (possibly formal) distribution on the parameter space depending only on the observed data.

Undoubtedly the leading and most natural approach to this problem is the first one. Indeed, objective Bayesian analysis is a recent name for a very old problem: how to perform good Bayesian inference, especially for moderate sample sizes, when one is unwilling or unable to assess a subjective prior. Under this approach the prior distribution is derived directly from the model and is therefore labeled objective. The reference prior, introduced by Bernardo (1979) and developed by Berger & Bernardo (1992), is the most successful default prior proposed in the literature. It has been deeply studied and its good properties, in particular with respect to the frequentist coverage of confidence sets, are well known. Typically, for a real parameter indexing a regular model, the reference prior coincides with the prior obtained by the Jeffreys rule, which is based on the Fisher information. For a multidimensional parameter the reference prior depends on the grouping and ordering of the components of the parameter and, in general, no longer coincides with the Jeffreys prior, which is known to be unsatisfactory in this context. Unfortunately, the reference prior is generally not simple to derive.

Fiducial distributions, after having been introduced by Fisher (1930, 1935) and widely discussed (and criticized) in the subsequent years, were de facto brushed aside for a long time and have regained vitality only recently. Originally Fisher considered a continuous sufficient statistic S with distribution function F_θ(s), depending on a real parameter θ. Let q_α(θ) denote the quantile of order α of F_θ and let s be a realization of S. If q_α(θ) is increasing in θ (i.e., F_θ(s) is decreasing in θ), the statement s < q_α(θ) is equivalent to θ > q_α^{-1}(s), and thus Fisher takes q_α^{-1}(s) as the quantile of order 1 − α of the fiducial distribution. The set of all quantiles q_α^{-1}(s), α ∈ (0, 1), establishes the fiducial distribution function H_s(θ) and the corresponding density h_s(θ), given by

H_s(θ) = 1 − F_θ(s)  and  h_s(θ) = −∂F_θ(s)/∂θ.  (1)

Of course, H_s and h_s must be suitably modified if F_θ(s) is increasing in θ.
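As a quick numerical illustration of (1) (our sketch, not part of the paper): for an i.i.d. exponential sample with rate λ, the statistic S = Σ_i X_i is sufficient with a Ga(n, λ) distribution; F_λ(s) is increasing in λ, and differentiating it yields a Ga(n, s) fiducial density for λ. The sample size and observed value below are made up.

```python
# Numerical check of Fisher's construction (1) for an exponential sample:
# S = sum(X_i) ~ Ga(n, rate = lambda); since F_lambda(s) is increasing in
# lambda, the fiducial density is d/d(lambda) F_lambda(s), and it should
# equal the Ga(n, rate = s) density.
import numpy as np
from scipy import stats

n, s = 10, 4.2                       # illustrative values
lam = np.linspace(0.1, 8.0, 400)
eps = 1e-6

F = lambda l: stats.gamma.cdf(s, a=n, scale=1.0 / l)   # F_lambda(s)
h_num = (F(lam + eps) - F(lam - eps)) / (2 * eps)      # numerical derivative
h_exact = stats.gamma.pdf(lam, a=n, scale=1.0 / s)     # Ga(n, s) density

print(np.max(np.abs(h_num - h_exact)))                 # ~0: they agree
```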

Fisher (1973, ch. VI) provided some examples of continuous multivariate fiducial distributions obtained by a step-by-step procedure, but he never developed a general and rigorous theory. This fact, together with the difficulty of covering discrete and/or multiparameter models, some inconsistencies of the fiducial distribution (e.g. the marginalization paradox, see Dawid & Stone, 1982) and the difficulties in its interpretation, gave rise to a quite strong negative attitude towards Fisher's proposal. It is interesting to notice that one of these inconsistencies, namely the lack of invariance of the fiducial distribution with respect to reparameterizations of the model, see Dempster (1963), can be explained by introducing the notion of inferential importance of the parameters, similarly to what happens for reference priors, see e.g. Bernardo & Smith (1994).

In the renewed interest in the fiducial approach, a relevant role is played by the generalized fiducial inference introduced and developed by Hannig (2009, 2013). He provides a formal and mathematically rigorous definition with quite general applicability. The crucial element of his definition is a data-generating equation X = G(U, θ), linking the unknown parameter θ and the observed data X through a random element U with known distribution. Roughly speaking, by shifting the randomness induced by U from X to θ (i.e., by inverting G with respect to θ after fixing X = x), the distribution prescribed by the statistical model leads to a distribution for the parameter θ. Contrary to the original idea of Fisher, the generalized fiducial distribution is non-unique, and Hannig widely discusses this point. Applications to different statistical models can be found, for instance, in Hannig et al. (2007), Hannig & Iyer (2008) and Wandler & Hannig (2012). Other recent contributions to the topic of fiducial distributions are given by Taraldsen & Lindqvist (2013), who discuss optimality of procedures, by Martin & Liu (2013), who attempt to define a quite general framework for inference with satisfactory long-run behavior, and by Veronese & Melilli (2015), henceforth V&M. In this last paper the authors, following the original idea of Fisher, derive fiducial distributions for both discrete and continuous real natural exponential families (NEFs) and discuss some of their properties, with particular emphasis on the frequentist coverage of fiducial intervals.

Historically, confidence distributions have typically been constructed by inverting the upper limits of lower one-sided confidence intervals, and they have often been associated with a fiducial interpretation. Recently, in Schweder & Hjort (2002) and Singh et al. (2005), a modern definition has been proposed. Confidence distribution theory can be seen as a frequentist setting in which both objective Bayesian posteriors and fiducial distributions can be studied and compared. Despite their general importance, confidence distributions are not crucial for the specific aims tackled here; we will return to this topic in Section 5 in connection with possible further developments.

The present paper does not discuss the philosophical bases of the different approaches, but aims to show how and when they lead to identical, or similar, conclusions in several standard situations. Our first goal is to suggest a simple way to construct a (unique) fiducial distribution for sufficiently general discrete and continuous multiparameter models. The proposal combines the step-by-step idea in Fisher (1973) with the results on fiducial distributions for real NEFs proved in V&M. The key point of the construction is the conditioning procedure: the distribution of the data is factorized as a product of one-dimensional laws and, for each of these, the fiducial density of a real parameter, possibly conditional on other parameters, is obtained. The joint fiducial distribution of the overall parameter is then defined as the product of the (conditional) one-dimensional fiducial laws. Assumptions on the statistical model needed to implement this procedure are discussed.

To clarify the underlying idea, consider the very standard example of the normal model with both mean µ and variance σ² unknown. The joint distribution function of the sufficient statistics X̄ = Σ_{i=1}^n X_i/n and S² = Σ_{i=1}^n (X_i − X̄)²/n can be factorized as

F_{X̄,S²}(x̄, s²; µ, σ²) = F_{X̄|S²}(x̄ | s²; µ, σ²) F_{S²}(s²; σ²),

and the fiducial distribution for (µ, σ²) can be constructed as the product of the two univariate fiducial distributions for σ² and for µ given σ², derived from F_{S²} and F_{X̄|S²}, respectively. Of course, in this case X̄ and S² are independent, but this is not crucial in our context. This joint fiducial distribution coincides with the posterior obtained with the reference prior, but is different from that derived from the Jeffreys rule.
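A minimal sampling sketch of this construction (ours, with made-up data; it relies only on the standard distributional results for the normal model): with ss = Σ_i (x_i − x̄)², the step-by-step procedure gives σ² ~ In-Ga((n−1)/2, ss/2) and µ | σ² ~ N(x̄, σ²/n), which is exactly the posterior under the reference prior π(µ, σ²) ∝ 1/σ².

```python
# Step-by-step fiducial sampler for the N(mu, sigma^2) model; it coincides
# with the reference posterior under pi(mu, sigma^2) proportional to 1/sigma^2.
import numpy as np
rng = np.random.default_rng(5)

x = rng.normal(2.0, 3.0, size=20)                 # illustrative data
n, xbar = x.size, x.mean()
ss = ((x - xbar) ** 2).sum()                      # raw sum of squares

draws = 100_000
# sigma^2 ~ In-Ga((n-1)/2, ss/2): draw G ~ Ga((n-1)/2, 1) and take (ss/2)/G
sigma2 = (ss / 2) / rng.gamma((n - 1) / 2, 1.0, draws)
mu = rng.normal(xbar, np.sqrt(sigma2 / n))        # mu | sigma^2 ~ N(xbar, sigma^2/n)
print(mu.mean(), np.median(sigma2))
```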

One of the novel aspects of our construction is that it explicitly recognizes the relevance of the ordering of the components of the parameter in terms of their inferential importance, as happens in the theory of reference priors. The second goal of the paper is thus to investigate the relationships between objective Bayesian posteriors and the suggested fiducial distributions. Lindley (1958) was the first to discuss these connections. Specifically, he proved that for models with a real parameter θ and a real sufficient statistic S, the fiducial distribution and the posterior coincide if and only if there exist transformations of S and θ which allow one to rewrite the sampling distribution as a location model. V&M extend this result, within the real NEFs, to discrete models, characterizing all families admitting a fiducial prior, i.e. a prior leading to a posterior equal to the fiducial distribution. This prior, when properly defined for discrete families, coincides with the Jeffreys prior. We show here, through several standard examples involving a nuisance parameter, that a similar relationship exists with the reference priors instead of the Jeffreys ones. In particular, we study models belonging to the class of the so-called conditionally reducible NEFs, see Consonni & Veronese (2001). This class of models can be indexed by a suitable parameter which strongly simplifies our procedure, because its components are independent under the fiducial distribution. Furthermore, we characterize the conditionally reducible NEFs for which a fiducial prior exists and show that it belongs to the enriched conjugate family. Finally, we prove the asymptotic normality of the fiducial distributions for this class, obtaining the same well-known result holding for Bayesian posteriors under regularity conditions.

The paper is structured as follows. Section 2 collects some basic properties and results on NEFs, including a brief review of the conditionally reducible NEFs. After recalling in Section 3.1 the main results on fiducial distributions for real NEFs stated in V&M, Section 3.2 presents a proposal for constructing a multivariate fiducial distribution in a quite general context. In Section 3.3 we discuss the relationships between fiducial distributions and reference posteriors, which are illustrated in Section 3.4 through several classical examples, including a general result on location-scale parameter models. Section 4 establishes fiducial distributions for conditionally reducible NEFs and provides their expression for a particular subclass, including the multinomial and negative-multinomial models. Families which admit a fiducial prior are characterized and relationships with the reference prior are shown. Section 4.3 deals with asymptotic normality. Section 5 highlights the role of confidence distributions as a broad setting in which both objective Bayesian and fiducial inference can be considered and suggests other possible developments. Finally, Appendix A1 collects some useful technical results on conditionally reducible NEFs, Appendix A2 briefly presents the notion of enriched conjugate family, and Appendix A3 contains the proofs of all the propositions and theorems stated in the paper.

2 Preliminaries on natural exponential families

2.1 Basic results

This subsection reviews some basic facts about exponential families; for a general treatment see Barndorff-Nielsen (1978), Brown (1986) and, for a Bayesian perspective, Gutiérrez-Peña & Smith (1997). Let ν be a σ-finite positive measure on the Borel sets of R^d, not concentrated on an affine subspace of R^d, and consider a family F of distributions whose densities with respect to ν are of the form

p_θ(x) = exp{ Σ_{k=1}^d θ_k x_k − M(θ) },  θ = (θ_1, …, θ_d) ∈ Θ,  x = (x_1, …, x_d) ∈ R^d,  (2)

with Θ ⊆ R^d nonempty. When the natural parameter space N = {θ ∈ R^d : ∫ exp{Σ_{k=1}^d θ_k x_k} ν(dx) < ∞} is open and Θ = N, the family F is said to be a regular natural exponential family (NEF). In the sequel we consider only regular NEFs. Any NEF can be reparameterized in terms of the mean parameter µ = (µ_1, …, µ_d), where µ = µ(θ) = ∂M(θ)/∂θ, because µ(·) is a one-to-one differentiable map from Θ onto Ω = µ(Θ). Notice that, for a regular NEF, Ω coincides with the interior of the convex hull of the support of ν. The matrix-valued function V(µ), whose ij-th element is ∂²M(θ)/∂θ_i∂θ_j evaluated at θ = θ(µ), is called the variance function of the family F. The pair (V(·), Ω) characterizes the NEF F.

When the family F is real (d = 1), its variance function is said to be quadratic if V(µ) = Qµ² + Lµ + C for some Q, L, C ∈ R such that V(µ) > 0 for all µ ∈ Ω. The class of real NEFs with quadratic variance function includes some of the most widely used families of distributions, such as the normal (with known variance), binomial, Poisson, gamma (with known shape parameter) and negative-binomial families, see Morris (1982, 1983).
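As a concrete one-dimensional instance (a standard fact, spelled out here for convenience): the Poisson family with mean µ can be written in the form (2) with respect to the measure ν(dx) = Σ_{j≥0} (1/j!) δ_j(dx), giving

p_θ(x) = exp{θx − e^θ},  M(θ) = e^θ,  θ = log µ,  N = R.

Hence µ = M′(θ) = e^θ, Ω = (0, ∞) and V(µ) = M″(θ(µ)) = µ, a quadratic variance function with Q = 0, L = 1 and C = 0.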

When F is defined on R^d, the notion of quadratic variance function can be extended in various ways (Letac, 1991). An important case is given by the simple quadratic variance function (SQVF), whose ij-th element is

V_ij(µ) = q µ_i µ_j + Σ_{k=1}^d µ_k L^{(k)}_ij + C_ij,

where q is a real constant and L^{(k)}, k = 1, …, d, and C are constant d×d symmetric matrices. Casalis (1996) showed that any NEF-SQVF can be obtained, via a nonsingular affine transformation, from one of the basic families: Poisson/normal, multinomial, negative-multinomial, negative-multinomial/gamma/normal and negative-multinomial/hyperbolic-secant. For a basic NEF-SQVF the linear part of the variance function is determined by a single d×d constant symmetric matrix L^{(0)}, so that it can be written as

V(µ) = q µµᵀ + L(µ) + C,  (3)

where q ∈ R, C is a d×d constant symmetric matrix and L(µ) is a symmetric matrix whose entries are linear in µ with coefficients given by L^{(0)}.

Consider now n random vectors X_1, …, X_n independent and identically distributed (i.i.d.) according to the density p_θ(x) in (2). Then S_n = Σ_{i=1}^n X_i is the minimal sufficient statistic for the sample and its density, with respect to the n-fold convolution ν_n of ν, is

p_{n,θ}(s) = exp{ Σ_{k=1}^d θ_k s_k − n M(θ) },  θ ∈ Θ.  (4)

Thus S_n is still distributed according to a NEF with natural parameter θ; the corresponding distribution function will be denoted by F_{n,θ}.

2.2 Conditionally reducible natural exponential families

A relevant role in our development of fiducial inference for multivariate parameters is played by the so-called conditionally reducible NEFs (in the sequel cr-NEFs), introduced in Consonni & Veronese (2001). We give here a brief overview, considering only a particular case of cr-NEFs, and refer to Appendix A1 and to the cited paper for more technical and general results. In the following, given a vector y = (y_1, …, y_d), we denote by y_{[k]} the sub-vector (y_1, …, y_k), k = 1, …, d−1.

Let X be a random vector distributed according to a NEF F on R^d, whose density with respect to ν is given in (2). F is a cr-NEF if, for each k = 1, …, d, the conditional distribution of X_k given X_{[k−1]} = x_{[k−1]} is a real exponential family with respect to a suitable transition kernel.

Notice that, for k = 1, this is the marginal distribution of X_1. Thus the joint density of a cr-NEF can be factorized as

p_θ(x_1, …, x_d) = ∏_{k=1}^d p_{ϕ_k(θ)}(x_k | x_{[k−1]}) = ∏_{k=1}^d exp{ ϕ_k(θ) x_k − M_k(ϕ_k(θ); x_{[k−1]}) },  (5)

where ϕ = (ϕ_1, …, ϕ_d) is a one-to-one function from Θ onto Φ = ϕ(Θ). Furthermore, it can be shown that Φ = Φ_1 × ⋯ × Φ_d, with ϕ_k ∈ Φ_k, k = 1, …, d, so that the ϕ_k's are variation independent. The parameter ϕ_k represents the natural parameter of the k-th conditional distribution. All basic NEF-SQVFs are cr-NEFs; their structure and the relationships among the different parameterizations are given in Consonni & Veronese (2001, Table 1).

Example 1 (Multinomial family). Consider a vector X distributed according to the multinomial distribution on R^d,

p_θ(x) = [ N! / (x_1! ⋯ x_{d+1}!) ] exp{ Σ_{k=1}^d θ_k x_k − M(θ) },  (6)

where M(θ) = N log(1 + Σ_{k=1}^d e^{θ_k}), Θ = R^d, x_{d+1} = N − Σ_{k=1}^d x_k, each x_k is a non-negative integer with Σ_{k=1}^d x_k ≤ N, and θ_k = log(p_k/(1 − Σ_{r=1}^d p_r)), with p_k the probability of the k-th outcome, k = 1, …, d. It is well known that the conditional distribution of X_k given X_{[k−1]} = x_{[k−1]}, k = 2, …, d, is Bi(N − Σ_{j=1}^{k−1} x_j, p_k/(1 − Σ_{j=1}^{k−1} p_j)), whereas the marginal distribution of X_1 is Bi(N, p_1), where Bi(n, p) denotes the binomial distribution with n trials and success probability p. Since the binomial family is a NEF whose natural parameter is the logit of the success probability, the family can be factorized as in (5) with

ϕ_k = log( p_k / (1 − Σ_{j=1}^k p_j) ),  ϕ_k ∈ Φ_k = R,  k = 1, …, d.  (7)

Notice that the parameter ϕ is specific to a given order of the vector components. Considering a permutation of (X_1, …, X_d), the resulting distribution is still multinomial, and thus conditionally reducible, but with a different ϕ-parameterization. This aspect is important in the construction of both reference priors and fiducial distributions, as we will see.
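A small simulation (ours; N, p and the conditioning value are made up) of the conditional-binomial structure of Example 1 and of the ϕ-parameterization (7):

```python
# Check that X_2 | X_1 = x_1 behaves as Bi(N - x_1, p_2/(1 - p_1)) in a
# multinomial model, and compute phi as in (7).
import numpy as np
rng = np.random.default_rng(0)

N, p = 20, np.array([0.2, 0.3, 0.4])              # last cell gets 0.1
X = rng.multinomial(N, np.append(p, 1 - p.sum()), size=200_000)

x1 = 5
cond = X[X[:, 0] == x1, 1]                         # X_2 given X_1 = 5
print(cond.mean(), (N - x1) * p[1] / (1 - p[0]))   # both ~ 5.625

phi = np.log(p / (1 - np.cumsum(p)))               # the parameters in (7)
print(phi)
```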

2.3 Objective Bayesian inference for natural exponential families

As recalled in Section 1, the reference prior is the most successful prior used in objective Bayesian inference. Typically, for a real parameter admitting an asymptotically normal posterior distribution, it coincides with the prior obtained by the well-known Jeffreys rule. This happens for a real NEF F with density (2), for which the Jeffreys prior π^J for θ is

π^J(θ) ∝ I(θ)^{1/2} = [M″(θ)]^{1/2},

where I(θ) = E_θ[ −∂² log p_θ(X)/∂θ² ] = M″(θ) is the Fisher information. The Jeffreys prior is invariant under reparameterizations of the model and, for real NEFs with quadratic variance function, it belongs to the standard conjugate family, π^J(θ) ∝ exp{θ s_0 − n_0 M(θ)}, with s_0 = L/2 and n_0 = −Q, where L and Q are the coefficients of V(µ), see Gutiérrez-Peña & Smith (1997).

For a multidimensional parameter the reference prior π^R depends on the grouping and ordering of its components and, in general, no longer coincides with π^J, which is known to be unsatisfactory in this context. The reference prior is not simple to derive in general, but for cr-NEF-SQVFs Consonni et al. (2004, Prop. 2) found a simple and general formula for π^R(ϕ). Furthermore, this prior is invariant with respect to the ordering of the groups and belongs to the enriched standard conjugate family defined in Consonni & Veronese (2001). This aspect is important for several reasons, not least because the posterior can be obtained by directly updating the hyperparameters of the family; see Appendix A2 for a basic review of this topic. If F is one of the basic NEF-SQVFs, then the d-group reference prior for ϕ is

π^R(ϕ) ∝ exp{ Σ_{k=1}^d ( (1/2) z_k ϕ_k + q B_k(ϕ_k) ) },  (8)

where z_k = L^{(0)}_{kk} is the k-th diagonal element of the matrix L^{(0)} appearing in (3), and B_k(ϕ_k) is defined in (39).

The following proposition, besides being useful to simplify the computation of reference priors for cr-NEFs, also establishes a connection between reference and Jeffreys priors and, in our context, between objective Bayesian and fiducial inference.

Proposition 1 Let F be a cr-NEF on R^d, with the k-th diagonal element of the Fisher information matrix given by I_kk(ϕ) = a_k(ϕ_k) b_k(ϕ_{[k−1]}). Then the d-group (order-invariant) reference prior π^R for ϕ = (ϕ_1, …, ϕ_d) is

π^R(ϕ) = ∏_{k=1}^d π^J_k(ϕ_k) ∝ ∏_{k=1}^d (a_k(ϕ_k))^{1/2},  (9)

where π^J_k(ϕ_k) is the Jeffreys prior obtained from the conditional distribution of X_k given X_{[k−1]} = x_{[k−1]}.

Finally, we recall that reference priors are not invariant under arbitrary reparameterizations of the model. Thus, given the reference prior for ϕ, in general it is not possible to recover the reference prior for an alternative parameter of the model, λ say, via the standard change-of-variable technique. The procedure is correct only when the Jacobian matrix of the transformation from ϕ to λ is lower triangular, i.e. λ_1 = g_1(ϕ_1), λ_2 = g_2(ϕ_1, ϕ_2), and so on; see Yang (1995) and Datta & Ghosh (1996). This is reasonable because such transformations preserve the order of inferential importance of the parameters, which is a crucial aspect of reference priors. Notice that the transformation from ϕ to µ is lower triangular, as shown in (40), and thus π^R(µ) can easily be derived from π^R(ϕ). However, the reference prior for µ is not order-invariant, unlike that for ϕ.

Example 1 (ctd.). As previously noted, a multinomial distribution can be factorized as a product of (conditional) binomial distributions with natural parameters ϕ_k. Thus it is immediate to verify that the Jeffreys prior π^J_k for ϕ_k is

π^J_k(ϕ_k) ∝ e^{ϕ_k/2} / (1 + e^{ϕ_k}),  ϕ_k ∈ R.

The reference prior for ϕ, obtained from (8) noting that z_k = 1 and qB_k(ϕ_k) = −log(1 + e^{ϕ_k}), see Consonni & Veronese (2001, Table 1), is

π^R(ϕ) ∝ exp{ Σ_{k=1}^d ( ϕ_k/2 − log(1 + e^{ϕ_k}) ) } = ∏_{k=1}^d e^{ϕ_k/2} / (1 + e^{ϕ_k}),  ϕ ∈ R^d,  (10)

and coincides with ∏_{k=1}^d π^J_k(ϕ_k), as stated in Proposition 1. Because the cell-probability parameter is proportional to the mean parameter, µ = Np, we can compute directly the reference prior on p = (p_1, …, p_d), with this order, from (10). Using (7), and noting that the Jacobian of the transformation is J_ϕ(p) = ∏_{i=1}^d p_i^{−1} · (1 − Σ_{i=1}^d p_i)^{−1}, we have

π^R(p) ∝ ∏_{k=1}^d [ p_k (1 − Σ_{j=1}^k p_j) ]^{−1/2}.  (11)

The reference prior π^R for p is a generalized Dirichlet distribution and belongs to the enriched conjugate family for the multinomial model, see Consonni & Veronese (2001).
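Under (10) the ϕ_k's are independent and each conditional probability v_k = e^{ϕ_k}/(1 + e^{ϕ_k}) has a Be(1/2, 1/2) distribution, so the generalized Dirichlet (11) can be sampled by stick-breaking. A short sketch (ours; the dimension and number of draws are arbitrary):

```python
# Sampling p = (p_1, ..., p_d) from the reference prior (11) by drawing
# independent Be(1/2, 1/2) conditional probabilities v_k and stick-breaking.
import numpy as np
rng = np.random.default_rng(6)

d, draws = 3, 100_000
v = rng.beta(0.5, 0.5, size=(draws, d))   # v_k = p_k / (1 - p_1 - ... - p_{k-1})
p = np.empty_like(v)
rest = np.ones(draws)
for k in range(d):
    p[:, k] = rest * v[:, k]
    rest = rest * (1 - v[:, k])
print(p.mean(axis=0))                      # Monte Carlo means of the p_k
```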

3 Fiducial distributions: a proposal of construction and some examples

In this section we introduce and discuss, with a particular eye towards connections and similarities with objective Bayesian analysis, a way to construct fiducial distributions for some multiparameter statistical models. First we recall some results holding for real NEFs.

3.1 Fiducial distributions for real NEFs

The fiducial distributions for real NEFs have been constructed in a quite simple and direct way by V&M, starting from a result in Petrone & Veronese (2010). Consider a sufficient statistic S_n with distribution function F_{n,θ}(s), belonging to a real NEF F with density (4); let a_n and b_n be the infimum and the supremum of its support, and define S*_n = [a_n, b_n) if ν_n({a_n}) > 0 and S*_n = (a_n, b_n) otherwise. Then, for s ∈ S*_n,

H_{n,s}(θ) = 0 for θ ≤ inf Θ;  H_{n,s}(θ) = 1 − F_{n,θ}(s) for inf Θ < θ < sup Θ;  H_{n,s}(θ) = 1 for θ ≥ sup Θ  (12)

is a fiducial distribution function (according to Fisher's idea) for the natural parameter θ. It follows that the fiducial density of θ is

h_{n,s}(θ) = ∂H_{n,s}(θ)/∂θ = −∂F_{n,θ}(s)/∂θ = ∫_{(−∞,s]} (n M′(θ) − t) p_{n,θ}(t) ν_n(dt).  (13)

It is simple to verify that the distribution function H_{n,s} is also a (possibly asymptotic) confidence distribution, according to its modern definition given in Schweder & Hjort (2002) and Singh et al. (2005).

For discrete NEFs, F_{n,θ}(s) = Pr_θ{S_n ≤ s} and Pr_θ{S_n < s} do not coincide and thus, besides H_{n,s} in (12), one can define a left fiducial distribution as

H^l_{n,s}(θ) = 1 − Pr_θ{S_n < s} = 1 − Pr_θ{S_n ≤ s⁻} = H_{n,s⁻}(θ),  (14)

where s⁻ denotes the point preceding s in the support of S_n. For convenience, H_{n,s} will sometimes be called the right fiducial distribution. A natural way to overcome this non-uniqueness might be to consider their arithmetic mean, i.e. the mixture

H^A_{n,s}(θ) = (H_{n,s}(θ) + H^l_{n,s}(θ))/2 = Pr_θ{S_n > s} + Pr_θ{S_n = s}/2,

whose density is the arithmetic mean of h_{n,s}(θ) and h^l_{n,s}(θ). Remarkably, H^A_{n,s} coincides with the approximate confidence distribution proposed for discrete data by Schweder & Hjort (2002); see also Hannig & Xie (2012).

The distribution H_{n,s} in (12) is well defined for each s ∈ S*_n, but it fails for s = b_n, with b_n finite and ν_n({b_n}) > 0, since H_{n,b_n}(θ) = 1 − F_{n,θ}(b_n) = 0 for each θ ∈ Θ. A similar problem exists for H^l_{n,s}, as occurs for instance in the binomial model. Thus both H_{n,s} and H^l_{n,s}, and hence their mixture, can be undefined in specific cases. A possible solution is to consider, instead of the arithmetic mean of h_{n,s} and h^l_{n,s}, their (suitably normalized) geometric mean, defined as

h^G_{n,s}(θ) = ( h_{n,s}(θ) h^l_{n,s}(θ) )^{1/2} / ∫ ( h_{n,s}(θ) h^l_{n,s}(θ) )^{1/2} dθ.

We denote by H^G_{n,s} the distribution function corresponding to h^G_{n,s}. Recently, Berger, Bernardo & Sun (2015) suggested using the geometric mean as a reasonable way to average different reference priors, because it is not affected by the normalizing constants, which often do not exist for reference priors. Furthermore, they mention its property of minimizing the Kullback-Leibler divergence, attributing this remark to Gauri Datta. We give a simple proof of this fact, without resorting to the calculus of variations. First recall that, given two densities p and q with the same support, the Kullback-Leibler divergence of p from q is defined as KL(q | p) = ∫ q(x) log(q(x)/p(x)) dx.

Proposition 2 Consider two densities p_1 and p_2 with the same support. Then the density q = p^G ∝ (p_1 p_2)^{1/2} minimizes the sum of the Kullback-Leibler divergences of p_1 and p_2 from q.

The previous proposition, together with other advantages which will be clarified later on, justifies the preference given to H^G_{n,s} over H^A_{n,s} as a way to combine H_{n,s} and H^l_{n,s}.

Real fiducial distributions are invariant under monotone continuous reparameterizations of the model. More precisely, if λ = λ(θ) is an increasing differentiable function of θ, then the fiducial distribution function of λ is H^λ_{n,s}(λ) = 1 − F_{n,θ(λ)}(s) = H_{n,s}(θ(λ)). The corresponding density h^λ_{n,s}(λ) coincides with that obtained directly via a change of variable from h_{n,s}(θ). A similar result holds if λ is decreasing in θ. Table 1 provides the fiducial distributions for some important NEFs obtained in V&M and used in the sequel.

Table 1: Fiducial distributions for some real NEFs

Model                   Sufficient statistic       Fiducial distributions
N(µ, σ²), σ² known      S_n = Σ_i X_i              H_{n,s}(µ): N(s/n, σ²/n)
N(µ, σ²), µ known       S_n = Σ_i (X_i − µ)²       H_{n,s}(σ²): In-Ga(n/2, s/2)
Ga(α, λ), α known       S_n = Σ_i X_i              H_{n,s}(λ): Ga(nα, s)
Pa(λ, x_0), x_0 known   S_n = Σ_i log(X_i/x_0)     H_{n,s}(λ): Ga(n, s)
We(λ, c), c known       S_n = Σ_i X_i^c            H_{n,s}(λ): Ga(n, s)
Bi(m, p), m known       S_n = Σ_i X_i              H_{n,s}(p): Be(s+1, nm−s); H^l_{n,s}(p): Be(s, nm−s+1); H^G_{n,s}(p): Be(s+1/2, nm−s+1/2)
Po(µ)                   S_n = Σ_i X_i              H_{n,s}(µ): Ga(s+1, n); H^l_{n,s}(µ): Ga(s, n); H^G_{n,s}(µ): Ga(s+1/2, n)
Ne-Bi(m, p), m known    S_n = Σ_i X_i              H_{n,s}(p): Be(nm, s+1); H^l_{n,s}(p): Be(nm, s); H^G_{n,s}(p): Be(nm, s+1/2)

Notation: Ga(α, λ) denotes a gamma distribution with shape α and mean α/λ; In-Ga(α, λ) an inverse-gamma distribution (if X ~ Ga(α, λ) then 1/X ~ In-Ga(α, λ)); Be(α, β) a beta distribution with parameters α and β; Bi(m, p) a binomial distribution with m trials and success probability p; Ne-Bi(m, p) a negative-binomial distribution with m successes and success probability p; Po(µ) the Poisson distribution with mean µ; Pa(λ, x_0) a Pareto distribution with density λ x_0^λ x^{−λ−1}, x > x_0 > 0, λ > 0; We(λ, c) a Weibull distribution with density cλ x^{c−1} exp(−λx^c), x > 0, λ > 0, c > 0.
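A quick numerical verification (ours) of the binomial rows of Table 1: the normalized geometric mean of the right and left fiducial densities Be(s+1, nm−s) and Be(s, nm−s+1) is exactly Be(s+1/2, nm−s+1/2). The values of nm and s below are arbitrary.

```python
import numpy as np
from scipy import stats
from scipy.integrate import simpson

nm, s = 25, 7
p = np.linspace(1e-4, 1 - 1e-4, 2000)
g = np.sqrt(stats.beta.pdf(p, s + 1, nm - s) * stats.beta.pdf(p, s, nm - s + 1))
g /= simpson(g, x=p)                       # normalize the geometric mean
print(np.max(np.abs(g - stats.beta.pdf(p, s + 0.5, nm - s + 0.5))))  # ~0
```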

3.1.1 Connections with Jeffreys priors

As mentioned in Section 1, Lindley (1958) was the first to study when there exists a prior (called by V&M a fiducial prior) whose corresponding posterior coincides with a given fiducial distribution. In particular, he proved that, within continuous NEFs, a fiducial prior exists only for the Gaussian (with known variance) and the gamma (with known shape) models. A full characterization of all real NEFs admitting a fiducial prior is given by V&M. The following proposition summarizes their results, which will be useful later on for comparison purposes.

Proposition 3 Let F be a real NEF with natural parameter θ.

i) If a fiducial prior for θ exists, then F has quadratic variance function.

ii) A fiducial prior exists if and only if F is an affine transformation of one of the following families: normal with known variance, gamma with known shape parameter, binomial, Poisson and negative-binomial. For the three discrete families, the fiducial prior exists for all of H_{n,s}, H^l_{n,s} and H^G_{n,s}.

iii) When a fiducial prior exists, it belongs to the family of conjugate distributions. Moreover, it coincides with the Jeffreys prior for continuous F, and also for discrete F if H^G_{n,s} is chosen as the fiducial distribution.

iv) The fiducial distribution H_{n,s} (or H^A_{n,s} in the discrete case) and the Bayesian posterior corresponding to the Jeffreys prior have the same Edgeworth expansion up to terms of order n^{−1}.

The previous results establish a strong connection, in the setting of real NEFs, between Jeffreys posteriors and fiducial distributions, so that the two approaches lead, in some sense, to the same objective inference. For a discussion of the coverage of fiducial and Jeffreys intervals and of their good frequentist properties, in particular when compared with the standard Wald intervals for real NEFs, see V&M and the references therein.

3.2 Fiducial distributions in a more general context

In this section we suggest a possible and natural way to extend the previous results to more complex models. As mentioned in Section 1, the idea is to construct a joint fiducial distribution as a product of (conditional) fiducial distributions obtained from the one-dimensional conditional densities in which the sampling distribution has been factorized. This construction mimics the one used to obtain reference priors in a multivariate context and, as a consequence, our fiducial distributions also depend on the chosen order of conditioning. As far as we know, this is the first time that this aspect is considered in a fiducial context. In the following we always assume that the one-dimensional conditional distribution functions F_θ(x | y) of the random quantities involved in the analysis satisfy the conditions under which −∂F_θ(x | y)/∂θ is a density for θ, for fixed x and y. Of course this is true, by (12), if F_θ(x | y) belongs to a NEF.

Consider a statistic T = (T_1, …, T_m) with density p_θ(t), θ = (θ_1, …, θ_d), d ≤ m, which summarizes the data X without losing information on θ; T can be a sufficient statistic or a one-to-one transformation of X. Suppose that T can be split as (T_{[d]}, T_{[−d]}), where T_{[d]} = (T_1, …, T_d) and T_{[−d]} = (T_{d+1}, …, T_m) is ancillary for θ. Then the inference on θ can be performed using the conditional distribution of T_{[d]} given T_{[−d]}. Assume now that there exists a one-to-one smooth reparameterization from θ to ϕ, with ϕ_1, …, ϕ_d ordered by inferential importance, such that

p_ϕ(t_{[d]} | t_{[−d]}) = ∏_{k=1}^d p_{ϕ_{d−k+1}}(t_k | t_{[k−1]}, t_{[−d]}; ϕ_{[d−k]}).  (15)

The density p_{ϕ_{d−k+1}}(t_k | t_{[k−1]}, t_{[−d]}; ϕ_{[d−k]}), with corresponding distribution function F_{ϕ_{d−k+1}}(t_k | t_{[k−1]}, t_{[−d]}; ϕ_{[d−k]}), must be interpreted as the conditional distribution of T_k given T_{[k−1]} = t_{[k−1]} and T_{[−d]} = t_{[−d]}, parameterized by ϕ_{d−k+1}, assuming ϕ_{[d−k]} known. Then we can construct the fiducial density of ϕ as

h_t(ϕ) = ∏_{k=1}^d h_{t_{[k]},t_{[−d]}}(ϕ_{d−k+1} | ϕ_{[d−k]}),  (16)

where t = (t_{[d]}, t_{[−d]}) and

h_{t_{[k]},t_{[−d]}}(ϕ_{d−k+1} | ϕ_{[d−k]}) = −∂F_{ϕ_{d−k+1}}(t_k | t_{[k−1]}, t_{[−d]}; ϕ_{[d−k]}) / ∂ϕ_{d−k+1}.  (17)

Before providing several applications of this procedure, it is convenient to discuss some general points related to the fiducial distribution (16).

The existence of an ancillary statistic T_{[−d]} is not required if m = d, i.e. when the dimensions of T and ϕ coincide.

The fiducial distribution (16) is essentially invariant with respect to one-to-one transformations of the statistic T. More precisely, because all the distributions are conditional on the ancillary statistic, any one-to-one transformation of T_{[−d]} induces the same conditioning. Moreover, the choice of T_{[d]} does not affect the resulting fiducial distribution either, if we consider a one-to-one transformation T* = (T*_{[d]}, T_{[−d]}), where T*_{[d]} = g(T_{[d]}, T_{[−d]}) is a lower triangular transformation of T_{[d]} for fixed T_{[−d]}, that is, T*_k = g_k(T_{[k]}, T_{[−d]}) for k = 1, …, d. Indeed, for t*_k = g_k(t_{[k]}, t_{[−d]}), with g_k increasing in t_k say, we have

Pr_{ϕ_{d−k+1}}(T*_k ≤ t*_k | T*_{[k−1]} = t*_{[k−1]}, T_{[−d]} = t_{[−d]}; ϕ_{[d−k]})
= Pr_{ϕ_{d−k+1}}(g_k(T_{[k]}, T_{[−d]}) ≤ g_k(t_{[k]}, t_{[−d]}) | T_{[k−1]} = t_{[k−1]}, T_{[−d]} = t_{[−d]}; ϕ_{[d−k]})
= Pr_{ϕ_{d−k+1}}(T_k ≤ t_k | T_{[k−1]} = t_{[k−1]}, T_{[−d]} = t_{[−d]}; ϕ_{[d−k]}),

so that T and T* lead to the same fiducial distribution.

If one is interested only in ϕ_1, it follows from (15) that it is enough to consider

h_{t_{[d]},t_{[−d]}}(ϕ_1) = −∂F_{ϕ_1}(t_d | t_{[d−1]}, t_{[−d]}) / ∂ϕ_1,

which depends on all the observations. Similarly, if one is interested in (ϕ_1, ϕ_2), it is enough to consider h_{t_{[d]},t_{[−d]}}(ϕ_1) h_{t_{[d−1]},t_{[−d]}}(ϕ_2 | ϕ_1), and so on.

If (T_{[k−1]}, T_{[−d]}) is sufficient for ϕ_{[d−k]} for each k, then the conditional distribution of T_k given T_{[k−1]} = t_{[k−1]} and T_{[−d]} = t_{[−d]} does not depend on ϕ_{[d−k]}, and the fiducial distribution (16) becomes the product of the marginal fiducial distributions of the ϕ_k's. As a consequence, the k-th factor in (16) can be used alone to make inference on ϕ_{d−k+1}, and the fiducial distribution does not depend on the inferential ordering of the parameters. An important case in which this happens will be discussed in Section 4.

As seen in Section 3.1, for discrete NEFs it is possible to define a right and a left fiducial distribution, starting from

Pr_{ϕ_{d−k+1}}{T_k ≤ t_k | T_{[k−1]} = t_{[k−1]}, T_{[−d]} = t_{[−d]}; ϕ_{[d−k]}} = F_{ϕ_{d−k+1}}(t_k | t_{[k−1]}, t_{[−d]}; ϕ_{[d−k]})

and from Pr_{ϕ_{d−k+1}}{T_k < t_k | T_{[k−1]} = t_{[k−1]}, T_{[−d]} = t_{[−d]}; ϕ_{[d−k]}}, respectively. Thus we could define 2^d different fiducial distributions, taking for each factor of (16) one of the two previous choices. In the following we consider only four natural cases: the fiducial distribution function H_t(ϕ), obtained as the product of all the right univariate conditional fiducial distributions; H^l_t(ϕ), obtained as the product of all the left ones; and two more motivated by the considerations developed in Section 3.1. The first, H^A_t(ϕ), is obtained as the product of the d mixtures H^A_{t_{[k]},t_{[−d]}} = (H_{t_{[k]},t_{[−d]}} + H^l_{t_{[k]},t_{[−d]}})/2, while the second, H^G_t(ϕ), corresponds to the density h^G_t(ϕ) obtained as the product of the d geometric means h^G_{t_{[k]},t_{[−d]}} ∝ (h_{t_{[k]},t_{[−d]}} h^l_{t_{[k]},t_{[−d]}})^{1/2}. Notice that h^G_t(ϕ) also coincides with the geometric mean of all the 2^d left and right joint fiducial densities.

3.3 Connections with reference priors

The construction by successive conditioning is the key point of the fiducial distribution h_t(ϕ) in (16), and it makes h_t(ϕ) depend on the order of the components of ϕ. This is why we assumed ϕ_1, …, ϕ_d to be ordered according to their inferential importance. Something similar happens in objective Bayesian analysis, where it is well known that the reference priors, and thus the reference posteriors, of a multidimensional parameter generally depend on the ordering; see Bernardo & Smith (1994). Indeed, the reference prior π^R(ϕ) for a parameter ϕ = (ϕ_1, …, ϕ_d) is generated by successive conditioning as

π^R(ϕ) = π^R(ϕ_d | ϕ_{[d−1]}) ⋯ π^R(ϕ_2 | ϕ_1) π^R(ϕ_1) = ∏_{k=1}^d π^R(ϕ_{d−k+1} | ϕ_{[d−k]}),

and our proposal (16) mimics this construction. We observe that in an objective Bayesian context this aspect is seen as a positive feature of the procedure: "the dependence of the reference prior on the quantity of interest has proved necessary to obtain posteriors with appropriate properties, in particular, to have good frequentist coverage properties (when attainable) and to avoid marginalization paradoxes and strong inconsistencies" (Berger, Bernardo & Sun, 2015).

Thus the reparameterization from θ to ϕ used in (15) has a double aim. On the one hand it allows the parameter of interest to appear explicitly, if it is not directly a component of θ; on the other hand it can be needed to achieve the structure of the conditional distributions in (15).

The fiducial distribution (16) is in general not invariant under reparameterizations. Note that this is true also for reference posteriors, unless the transformation from ϕ to λ = (λ_1, …, λ_d), say, maintains the same ordering of importance in the components of the two vectors and λ_k is a function of ϕ_1, …, ϕ_k for each k = 1, …, d, i.e. ϕ(λ) is a lower triangular transformation; see Yang (1995) and Datta & Ghosh (1996). This result holds also for fiducial distributions, as the following proposition shows.

Proposition 4 If ϕ = ϕ(λ) is a one-to-one lower triangular continuously differentiable function from Λ to Φ, then the fiducial distribution h^ϕ_t(ϕ), obtained applying (16) to the model p_ϕ(t), and the fiducial distribution h^λ_t(λ), obtained applying (16) to the model p_λ(t) = p_{ϕ(λ)}(t), are such that, for each measurable set A ⊆ Φ,

∫_A h^ϕ_t(ϕ) dϕ = ∫_{ϕ^{−1}(A)} h^λ_t(λ) dλ.  (18)

3.4 Examples

This section shows how the suggested procedure for constructing fiducial distributions can be fruitfully applied to several classical problems. For discrete models we always choose H^G_t as the fiducial distribution. We will see that, in many cases, the fiducial distribution and the reference posterior coincide; the latter is generated by a reference prior which can be seen as the product of the Jeffreys priors derived from the conditional distributions of the data, as seen in Proposition 1.

3.4.1 Location-scale parameter models

Consider first the case in which only one parameter, θ, is unknown. These models admit an ancillary statistic Z; in particular, we take Z_i = X_i − X_1 or Z_i = X_i/X_1, i = 2, …, n, according to whether θ is a location or a scale parameter. The following proposition characterizes the fiducial distribution for θ and establishes the equivalence with objective Bayesian inference.

Proposition 5 Let X = (X_1, …, X_n) be an i.i.d. sample of size n from a density p_θ, θ ∈ Θ ⊆ R. If θ is a location or a scale parameter, then the fiducial distribution coincides with the Bayesian posterior obtained with the Jeffreys prior π^J(θ) ∝ 1 or π^J(θ) ∝ 1/θ, respectively.
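The idea behind the location case can be sketched in one line for n = 1 (our illustration; the general proof, which conditions on the ancillary differences, is in Appendix A3): if F_θ(x) = F_0(x − θ), then

h_x(θ) = −∂F_0(x − θ)/∂θ = f_0(x − θ) ∝ p_θ(x),

so the fiducial density is proportional to the likelihood and hence coincides with the posterior under π^J(θ) ∝ 1. The scale case reduces to the location one by working with log X and log θ, which turns the flat prior on log θ into π^J(θ) ∝ 1/θ.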

Example 2. Let X be an i.i.d. sample from a uniform distribution on (0, θ), θ > 0, so that θ is a scale parameter. First notice that S = max(X_1, …, X_n) is a sufficient statistic for θ, and thus we can obtain the fiducial distribution directly:

h_s(θ) = ∂H_s(θ)/∂θ = −∂F_θ(s)/∂θ = −∂(s/θ)^n/∂θ = n s^n / θ^{n+1},  θ > s.  (19)

Alternatively, set w = max(z_2, …, z_n) and consider the distribution function of X_1 given the ancillary statistic Z = (X_2/X_1, …, X_n/X_1):

F_θ(x_1 | z) = (x_1/θ)^n for 0 < x_1 < θ, if 0 < w ≤ 1;  F_θ(x_1 | z) = (x_1 w/θ)^n for 0 < x_1 < θ/w, if w > 1.  (20)

Now, because w ≤ 1 means x_1 = max(x_1, …, x_n), while for w > 1 we have x_1 w = max(x_2, …, x_n), expression (20), as a function of θ, is equivalent to that in (19) and thus provides the same fiducial distribution, which trivially coincides with the Jeffreys posterior. Note that if a one-dimensional sufficient statistic does not exist, only the second route is available.

Example 3. Let X be an i.i.d. sample from a uniform distribution on (θ, θ+1), θ ∈ R, so that θ is a location parameter. There exists a sufficient statistic S = (S_1, S_2) = (min(X_1, …, X_n), max(X_1, …, X_n)), but it is not one-dimensional. Thus we compute the fiducial distribution starting from the distribution function of S_2 given the ancillary statistic Z = S_2 − S_1. Specifically, we have

h_s(θ) = −∂F_θ(s_2 | z)/∂θ = −∂[(s_2 − z − θ)/(1 − z)]/∂θ = 1/(1 − z),  s_2 − 1 < θ < s_2 − z = s_1,

which coincides with the Jeffreys posterior π^J(θ | x).

Example 4. Let X be an i.i.d. sample from a logistic distribution (or a Cauchy distribution) with known scale parameter σ. In this case a sufficient statistic does not exist, and the procedure for constructing the fiducial distribution cannot be simplified. Using Proposition 5, however, we can write it as a Bayesian posterior obtained from a constant prior. Thus for the logistic distribution we have

h_x(θ) ∝ ∏_{i=1}^n exp{−(x_i − θ)/σ} / [ σ (1 + exp{−(x_i − θ)/σ})² ],

whose normalizing constant can easily be computed numerically or via simulation. We can proceed similarly for the Cauchy distribution.
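A minimal numerical sketch (ours; the data are made up) of the computation in Example 4 for the logistic model:

```python
# Normalize the fiducial density of the logistic location parameter on a
# grid; h_x(theta) is proportional to the likelihood (constant prior).
import numpy as np
from scipy.integrate import simpson

x = np.array([-0.4, 1.1, 0.3, 2.0, 0.7])     # illustrative sample
sigma = 1.0
theta = np.linspace(-5.0, 7.0, 2001)

z = (x[:, None] - theta[None, :]) / sigma
lik = np.prod(np.exp(-z) / (sigma * (1 + np.exp(-z)) ** 2), axis=0)
h = lik / simpson(lik, x=theta)              # normalized fiducial density
print(simpson(h, x=theta))                    # ~1.0
```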

Consider now a model with a location parameter θ and a scale parameter σ, both unknown. Given an i.i.d. sample of size n, an ancillary statistic is, for example, Z = (Z_3, …, Z_n), with Z_j = (X_j − X_1)/Z_2, j = 3, …, n, where Z_2 = X_2 − X_1 is marginally ancillary for θ. Then the one-to-one transformation from X to (X_1, Z_2, Z) allows us to write the sampling distribution as p_{θ,σ}(x_1 | z_2, z) p_σ(z_2 | z) p(z), so that the fiducial distribution for (σ, θ) can be obtained using only the first two factors of the product. Note that in specific contexts other transformations can be more appropriate. For example, in a normal model it is more natural to use (X̄ = Σ_i X_i/n, S² = Σ_i (X_i − X̄)², Z), with Z_j = (X_j − X̄)/S, j = 3, …, n, so that the factorization becomes p_{θ,σ}(x̄ | s², z) p_σ(s² | z) p(z). Because (X̄, S²) is a complete sufficient statistic and X̄ and S² are independent, the fiducial distribution is in this case obtained starting from p_{θ,σ}(x̄) p_σ(s²), as suggested in Section 1.

Proposition 6 Let X = (X_1, …, X_n) be an i.i.d. sample from a density p_{θ,σ}, where θ and σ represent a location and a scale parameter, respectively. Then the fiducial distribution for (σ, θ) coincides with the Bayesian posterior obtained with the reference prior π^R(σ, θ) ∝ 1/σ.

Notice that π^R(σ, θ) ∝ 1/σ is different from the prior π^J(σ, θ) ∝ 1/σ² obtained by the Jeffreys rule. Furthermore, π^R does not depend on the order of θ and σ, while our procedure applies only to the ordering (σ, θ). However, our fiducial distribution corresponds to that derived by Fisher (1973, Sec. 6.8) for the normal model and to those obtained through other symmetric fiducial approaches, see Hannig (2009) and Fraser (1961). Thus in this model the inferential order of importance seems irrelevant.

3.4.2 Examples concerning normal models

i) Difference of means. Consider two independent normal i.i.d. samples, each of size n, with known common variance σ² and means µ_1 and µ_2, respectively. The sufficient statistics are the sample sums S_1 and S_2, with S_i ~ N(nµ_i, nσ²), i = 1, 2. To make inference on δ = µ_2 − µ_1, we reparameterize the joint density of (S_1, S_2) in terms of (δ = µ_2 − µ_1, λ = µ_1), obtaining

p_{δ,λ}(s_1, s_2) = (2πnσ²)^{−1} exp{ −(s_1² + s_2²)/(2nσ²) } exp{ δ s_2/σ² + λ(s_1 + s_2)/σ² − n(2λ² + δ² + 2λδ)/(2σ²) }.

It follows that the conditional distribution of S_2 given S_1 + S_2 is N((nδ + s_1 + s_2)/2, nσ²/2). From Table 1, the fiducial distribution of δ/2 + (s_1 + s_2)/(2n) is N(s_2/n, σ²/(2n)), so that δ is N(x̄_2 − x̄_1, 2σ²/n), where x̄_i = s_i/n, while the fiducial distribution of λ given δ, derived from the marginal distribution of S_2, is N(x̄_2 − δ, σ²/n). The joint fiducial distribution of δ and λ then follows.

The same fiducial distribution for δ = µ_2 − µ_1 is obtained starting from the two independent marginal fiducial distributions of µ_1 and µ_2. Indeed, from Table 1, the fiducial distribution of µ_i, obtained from S_i, is N(x̄_i, σ²/n), i = 1, 2, so that a direct transformation implies that δ is N(x̄_2 − x̄_1, 2σ²/n), as before. It is worth remarking that this phenomenon (the coincidence of the two fiducial distributions of the same parameter, obtained starting from different parameters of interest) is specific to this and a few other examples, but it does not hold in general, as shown for instance in Section 3.4.3.
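A Monte Carlo sketch of the agreement just described (ours; all numbers are made up): draws of δ = µ_2 − µ_1 under the two independent marginal fiducials N(x̄_i, σ²/n) reproduce N(x̄_2 − x̄_1, 2σ²/n).

```python
import numpy as np
rng = np.random.default_rng(4)

n, sigma, xbar1, xbar2 = 25, 1.5, 0.8, 1.7
mu1 = rng.normal(xbar1, sigma / np.sqrt(n), 100_000)
mu2 = rng.normal(xbar2, sigma / np.sqrt(n), 100_000)
delta = mu2 - mu1

print(delta.mean(), delta.std())             # ~ xbar2 - xbar1, ~ sigma*sqrt(2/n)
print(xbar2 - xbar1, sigma * np.sqrt(2 / n))
```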

ii) Many normal means (Neyman & Scott, 1948). Consider n samples of size two, (X_{i1}, X_{i2}), with each X_{ij} independently distributed according to a N(µ_i, σ²), i = 1, …, n. The aim is to make inference on the common variance σ², with nuisance parameters (µ_1, …, µ_n). Let X̄_i = (X_{i1} + X_{i2})/2 and W = Σ_{i=1}^n (X_{i1} − X_{i2})². This well-known example is used to show that the standard maximum likelihood estimator of σ², as well as the one based on the profile likelihood, equals σ̂² = W/(4n), which is inconsistent because W/(4n) → σ²/2 as n → ∞. To obtain the fiducial distribution of σ², first notice that the joint distribution of the sufficient statistic (X̄ = (X̄_1, …, X̄_n), W) can be factorized as

( ∏_{i=1}^n p_{µ_i,σ²}(x̄_i) ) p_{σ²}(w),  (21)

thanks to the independence of the X̄_i's and W. From the results in Table 1, it follows that the fiducial distribution of each µ_i given σ², derived from p_{µ_i,σ²}(x̄_i), is N(x̄_i, σ²/2), and that of σ², obtained from p_{σ²}(w), is In-Ga(n/2, w/4). As a consequence,

h_{x̄,w}(µ, σ²) = ( ∏_{i=1}^n h_{x̄_i}(µ_i | σ²) ) h_w(σ²).  (22)

It is interesting to observe that the fiducial distribution (22) is equal to the posterior obtained from the reference prior, independent of the ordering and grouping, π^R(σ², µ_1, …, µ_n) ∝ 1/σ². This distribution does not present the inconsistency of the likelihood estimator, which instead occurs using the Jeffreys prior π^J(σ², µ_1, …, µ_n) ∝ 1/σ^{n+2}.
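An illustrative simulation (ours) of the contrast between the inconsistent likelihood estimator and the fiducial distribution (22), whose mean (w/4)/(n/2 − 1) converges to σ²:

```python
import numpy as np
rng = np.random.default_rng(1)

sigma2, n = 2.0, 5000
x1 = rng.normal(0.0, np.sqrt(sigma2), n)     # taking mu_i = 0 for simplicity
x2 = rng.normal(0.0, np.sqrt(sigma2), n)
w = np.sum((x1 - x2) ** 2)

print(w / (4 * n))                # MLE: ~ sigma^2 / 2 = 1.0 (inconsistent)
print((w / 4) / (n / 2 - 1))      # mean of In-Ga(n/2, w/4): ~ sigma^2 = 2.0
```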

3.4.3 Comparison of two Poisson rates

The comparison of the rates µ_1 and µ_2 of two independent Poisson distributions is a classical problem arising in many contexts. For instance, a uniformly most powerful unbiased test for the ratio η = µ_2/µ_1 is discussed in Lehmann (2005). Given two i.i.d. samples of size n from two independent Poisson distributions, the sufficient statistics are the sample sums S_1 and S_2, with S_i ~ Po(nµ_i), i = 1, 2. Reparameterizing the joint density of (S_1, S_2) in terms of (η = µ_2/µ_1, λ = µ_1), we have

p_{η,λ}(s_1, s_2) = [ n^{s_1+s_2} / (s_1! s_2!) ] exp{ s_2 log η + (s_1 + s_2) log λ − nλ(1 + η) }.

The conditional distribution of S_2 given S_1 + S_2 is Bi(s_1 + s_2, η/(1+η)), and the marginal distribution of S_1 + S_2 is Po(nλ(1+η)). Using Table 1, the fiducial density h^G of η/(1+η), derived from the conditional distribution, is Be(s_2 + 1/2, s_1 + 1/2), which induces on η

h^G_{s_1,s_2}(η) = [ 1 / B(s_2 + 1/2, s_1 + 1/2) ] η^{s_2−1/2} (1 + η)^{−s_1−s_2−1},  η > 0,  (23)

where B(·,·) denotes the beta function. From the marginal distribution of S_1 + S_2, using again Table 1, it follows that h^G_{s_1,s_2}(λ | η) is Ga(s_1 + s_2 + 1/2, n(1+η)), and thus the joint fiducial distribution of η and λ is h^G_{s_1,s_2}(η, λ) = h^G_{s_1,s_2}(λ | η) h^G_{s_1,s_2}(η).

The fiducial distribution (23) of η coincides with both the reference and the Jeffreys posterior distributions. However, it is interesting, and easy to verify, that h^G_{s_1,s_2}(η, λ) is equal to the reference posterior on (η, λ), but is different from the Jeffreys posterior, which instead coincides with the fiducial distribution induced on (η, λ) by the two independent marginal fiducial densities of µ_1 and µ_2.
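A short check (ours; s_1 and s_2 are made up) that draws from (23) can be generated through the Be(s_2 + 1/2, s_1 + 1/2) representation of η/(1+η):

```python
import numpy as np
from scipy import stats
rng = np.random.default_rng(3)

s1, s2 = 14, 8
psi = rng.beta(s2 + 0.5, s1 + 0.5, 200_000)   # fiducial draws of eta/(1+eta)
eta = psi / (1 - psi)                          # draws from density (23)

m = 1.0                                        # compare CDFs at eta = m
print((eta <= m).mean())                                  # empirical
print(stats.beta.cdf(m / (1 + m), s2 + 0.5, s1 + 0.5))    # exact
```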

3.4.4 Bivariate binomial

A Bayesian analysis of the bivariate binomial model was discussed by Crowder & Sweeting (1989) in connection with a microbiological application. Consider m spores, each with probability p of germinating, and denote by R the random number of germinating spores, so that R is Bi(m, p). If q is the probability that one of the germinating spores bends in a particular direction and S is the random number of them, the distribution of S given R = r is Bi(r, q). The joint distribution of (R, S) is called bivariate binomial. Crowder and Sweeting observe that the Jeffreys prior π^J(p, q) ∝ p^{−1}(1−p)^{−1/2} q^{−1/2}(1−q)^{−1/2} is not satisfactory because of its asymmetry in p and 1−p, while Polson & Wasserman (1990) show that this drawback does not occur using the (order-invariant) reference prior π^R(p, q) ∝ p^{−1/2}(1−p)^{−1/2} q^{−1/2}(1−q)^{−1/2}, which is the product of the two independent Jeffreys priors for p and q. The symmetry condition is also satisfied by the (order-invariant) reference prior for the alternative parameterization (η = pq, λ = p(1−q)/(1−pq)), which is π^R(η, λ) ∝ η^{−1/2}(1−η)^{−1/2} λ^{−1/2}(1−λ)^{−1/2}.

The joint fiducial density h^G_{r,s}(q, p) can be obtained, as usual, as the product of h^G_{r,s}(q), derived from the conditional model Bi(r, q), and h^G_r(p | q), derived from the marginal model Bi(m, p) of R, which does not depend on q, so that p and q are independent under h^G_{r,s}. This fact makes the fiducial distribution order-invariant, as seen for the reference prior. Because for the binomial model there exists a fiducial prior Be(1/2, 1/2), equivalent to the Jeffreys prior (see Proposition 3), it follows immediately that the fiducial distribution of (p, q) coincides with the reference posterior. If the parameter of interest is η, or λ, we can reparameterize the model in terms of the distribution of R − S given S = s, which is Bi(m − s, λ), and that of S, which is Bi(m, η), and prove again that the joint fiducial distribution and the reference posterior are both order-invariant and coincide.

3.4.5 Ratio of parameters of a trinomial distribution

Bernardo & Ramon (1998) perform the Bayesian reference analysis of the ratio of two multinomial parameters, discussing in detail the case of the trinomial distribution and presenting some applications. Consider n observations, each belonging to one of three categories, and denote by X_i, i = 1, 2, 3, the number of occurrences in category i. Then the joint distribution of X_1 and X_2 is trinomial, with parameters p_1 and p_2 representing the probabilities of the first two categories. Bernardo & Ramon (1998, formulas (7) and (8)) show that, if the parameter of interest is η = p_1/p_2 and λ = p_2 is the nuisance parameter, then the (proper) reference prior for (η, λ) is

π^R(η, λ) ∝ [ η(1+η) λ (1 − λ(1+η)) ]^{−1/2},  0 < λ < (1+η)^{−1},  0 < η < ∞,  (24)

so that the marginal reference posterior of η is

π^R(η | x_1, x_2) ∝ η^{x_1−1/2} (1 + η)^{−x_1−x_2−1}.  (25)

To find the fiducial distribution of η, let us reparameterize the trinomial model in terms of (η, λ):

p_{η,λ}(x_1, x_2) = [ n! / (x_1! x_2! x_3!) ] exp{ x_1 log η + (x_1 + x_2) log(λ/(1 − ηλ − λ)) + n log(1 − ηλ − λ) }.

The conditional distribution of X_1 given T = X_1 + X_2 = t is Bi(t, η/(1+η)), while the distribution of T is Bi(n, λ(1+η)). Thus, Table 1 shows that the fiducial density h^G_{x_1,t} of η/(1+η) is Be(x_1 + 1/2, t − x_1 + 1/2), which induces a fiducial density on η equal to (25). Of course, the joint fiducial density of (η, λ) is given by h^G_{x_1,x_2}(η, λ) = h^G_{x_1,t}(η) h^G_t(λ | η), which coincides with the reference posterior obtained from (24).

4 Fiducial distributions for conditionally reducible NEFs

4.1 Properties and examples

The construction of the fiducial distribution proposed in Section 3.2 becomes much simpler when the model belongs to a cr-NEF. In this case there exists a sufficient statistic of the same dimension as the parameter, so that no ancillary statistic is needed, while the ϕ-parameterization, indexing each conditional distribution by a real parameter, implies the independence of the ϕ_k's under the fiducial distribution.
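As a concrete sketch of this independence (ours; trinomial case d = 2 with made-up counts, using the H^G entries of Table 1 for each conditional binomial):

```python
# Fiducial sampler for the phi-parameterization of a trinomial model:
# v_1 ~ Be(x_1 + 1/2, N - x_1 + 1/2) from X_1 ~ Bi(N, v_1), and independently
# v_2 ~ Be(x_2 + 1/2, N - x_1 - x_2 + 1/2) from X_2 | X_1 ~ Bi(N - x_1, v_2).
import numpy as np
rng = np.random.default_rng(2)

N, x1, x2 = 30, 12, 9
v1 = rng.beta(x1 + 0.5, N - x1 + 0.5, 10_000)
v2 = rng.beta(x2 + 0.5, N - x1 - x2 + 0.5, 10_000)
phi1, phi2 = np.log(v1 / (1 - v1)), np.log(v2 / (1 - v2))

p1, p2 = v1, (1 - v1) * v2                 # recover the cell probabilities
print(np.corrcoef(phi1, phi2)[0, 1])       # ~0: the phi_k are independent
```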


More information

Parametric Techniques

Parametric Techniques Parametric Techniques Jason J. Corso SUNY at Buffalo J. Corso (SUNY at Buffalo) Parametric Techniques 1 / 39 Introduction When covering Bayesian Decision Theory, we assumed the full probabilistic structure

More information

Default priors and model parametrization

Default priors and model parametrization 1 / 16 Default priors and model parametrization Nancy Reid O-Bayes09, June 6, 2009 Don Fraser, Elisabeta Marras, Grace Yun-Yi 2 / 16 Well-calibrated priors model f (y; θ), F(y; θ); log-likelihood l(θ)

More information

Foundations of Nonparametric Bayesian Methods

Foundations of Nonparametric Bayesian Methods 1 / 27 Foundations of Nonparametric Bayesian Methods Part II: Models on the Simplex Peter Orbanz http://mlg.eng.cam.ac.uk/porbanz/npb-tutorial.html 2 / 27 Tutorial Overview Part I: Basics Part II: Models

More information

Maximum Likelihood Estimation

Maximum Likelihood Estimation Chapter 8 Maximum Likelihood Estimation 8. Consistency If X is a random variable (or vector) with density or mass function f θ (x) that depends on a parameter θ, then the function f θ (X) viewed as a function

More information

Uncertain Inference and Artificial Intelligence

Uncertain Inference and Artificial Intelligence March 3, 2011 1 Prepared for a Purdue Machine Learning Seminar Acknowledgement Prof. A. P. Dempster for intensive collaborations on the Dempster-Shafer theory. Jianchun Zhang, Ryan Martin, Duncan Ermini

More information

Parametric Techniques Lecture 3

Parametric Techniques Lecture 3 Parametric Techniques Lecture 3 Jason Corso SUNY at Buffalo 22 January 2009 J. Corso (SUNY at Buffalo) Parametric Techniques Lecture 3 22 January 2009 1 / 39 Introduction In Lecture 2, we learned how to

More information

Conjugate Predictive Distributions and Generalized Entropies

Conjugate Predictive Distributions and Generalized Entropies Conjugate Predictive Distributions and Generalized Entropies Eduardo Gutiérrez-Peña Department of Probability and Statistics IIMAS-UNAM, Mexico Padova, Italy. 21-23 March, 2013 Menu 1 Antipasto/Appetizer

More information

ST5215: Advanced Statistical Theory

ST5215: Advanced Statistical Theory Department of Statistics & Applied Probability Monday, September 26, 2011 Lecture 10: Exponential families and Sufficient statistics Exponential Families Exponential families are important parametric families

More information

The binomial model. Assume a uniform prior distribution on p(θ). Write the pdf for this distribution.

The binomial model. Assume a uniform prior distribution on p(θ). Write the pdf for this distribution. The binomial model Example. After suspicious performance in the weekly soccer match, 37 mathematical sciences students, staff, and faculty were tested for the use of performance enhancing analytics. Let

More information

Asymptotics for posterior hazards

Asymptotics for posterior hazards Asymptotics for posterior hazards Pierpaolo De Blasi University of Turin 10th August 2007, BNR Workshop, Isaac Newton Intitute, Cambridge, UK Joint work with Giovanni Peccati (Université Paris VI) and

More information

Part 6: Multivariate Normal and Linear Models

Part 6: Multivariate Normal and Linear Models Part 6: Multivariate Normal and Linear Models 1 Multiple measurements Up until now all of our statistical models have been univariate models models for a single measurement on each member of a sample of

More information

A General Overview of Parametric Estimation and Inference Techniques.

A General Overview of Parametric Estimation and Inference Techniques. A General Overview of Parametric Estimation and Inference Techniques. Moulinath Banerjee University of Michigan September 11, 2012 The object of statistical inference is to glean information about an underlying

More information

Monte Carlo conditioning on a sufficient statistic

Monte Carlo conditioning on a sufficient statistic Seminar, UC Davis, 24 April 2008 p. 1/22 Monte Carlo conditioning on a sufficient statistic Bo Henry Lindqvist Norwegian University of Science and Technology, Trondheim Joint work with Gunnar Taraldsen,

More information

Spring 2012 Math 541B Exam 1

Spring 2012 Math 541B Exam 1 Spring 2012 Math 541B Exam 1 1. A sample of size n is drawn without replacement from an urn containing N balls, m of which are red and N m are black; the balls are otherwise indistinguishable. Let X denote

More information

Part III. A Decision-Theoretic Approach and Bayesian testing

Part III. A Decision-Theoretic Approach and Bayesian testing Part III A Decision-Theoretic Approach and Bayesian testing 1 Chapter 10 Bayesian Inference as a Decision Problem The decision-theoretic framework starts with the following situation. We would like to

More information

The Jeffreys Prior. Yingbo Li MATH Clemson University. Yingbo Li (Clemson) The Jeffreys Prior MATH / 13

The Jeffreys Prior. Yingbo Li MATH Clemson University. Yingbo Li (Clemson) The Jeffreys Prior MATH / 13 The Jeffreys Prior Yingbo Li Clemson University MATH 9810 Yingbo Li (Clemson) The Jeffreys Prior MATH 9810 1 / 13 Sir Harold Jeffreys English mathematician, statistician, geophysicist, and astronomer His

More information

A Very Brief Summary of Statistical Inference, and Examples

A Very Brief Summary of Statistical Inference, and Examples A Very Brief Summary of Statistical Inference, and Examples Trinity Term 2008 Prof. Gesine Reinert 1 Data x = x 1, x 2,..., x n, realisations of random variables X 1, X 2,..., X n with distribution (model)

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano Testing Statistical Hypotheses Third Edition 4y Springer Preface vii I Small-Sample Theory 1 1 The General Decision Problem 3 1.1 Statistical Inference and Statistical Decisions

More information

Inferential models: A framework for prior-free posterior probabilistic inference

Inferential models: A framework for prior-free posterior probabilistic inference Inferential models: A framework for prior-free posterior probabilistic inference Ryan Martin Department of Mathematics, Statistics, and Computer Science University of Illinois at Chicago rgmartin@uic.edu

More information

Covariance function estimation in Gaussian process regression

Covariance function estimation in Gaussian process regression Covariance function estimation in Gaussian process regression François Bachoc Department of Statistics and Operations Research, University of Vienna WU Research Seminar - May 2015 François Bachoc Gaussian

More information

Introduction to Bayesian Methods

Introduction to Bayesian Methods Introduction to Bayesian Methods Jessi Cisewski Department of Statistics Yale University Sagan Summer Workshop 2016 Our goal: introduction to Bayesian methods Likelihoods Priors: conjugate priors, non-informative

More information

STA 732: Inference. Notes 10. Parameter Estimation from a Decision Theoretic Angle. Other resources

STA 732: Inference. Notes 10. Parameter Estimation from a Decision Theoretic Angle. Other resources STA 732: Inference Notes 10. Parameter Estimation from a Decision Theoretic Angle Other resources 1 Statistical rules, loss and risk We saw that a major focus of classical statistics is comparing various

More information

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1

TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 TABLE OF CONTENTS CHAPTER 1 COMBINATORIAL PROBABILITY 1 1.1 The Probability Model...1 1.2 Finite Discrete Models with Equally Likely Outcomes...5 1.2.1 Tree Diagrams...6 1.2.2 The Multiplication Principle...8

More information

Lecture 2: Basic Concepts of Statistical Decision Theory

Lecture 2: Basic Concepts of Statistical Decision Theory EE378A Statistical Signal Processing Lecture 2-03/31/2016 Lecture 2: Basic Concepts of Statistical Decision Theory Lecturer: Jiantao Jiao, Tsachy Weissman Scribe: John Miller and Aran Nayebi In this lecture

More information

Generalized Fiducial Inference

Generalized Fiducial Inference Generalized Fiducial Inference Parts of this short course are joint work with T. C.M Lee (UC Davis), H. Iyer (NIST) Randy Lai (U of Maine), J. Williams (UNC), Y. Cui (UNC), BFF 2018 Jan Hannig a University

More information

Multinomial Data. f(y θ) θ y i. where θ i is the probability that a given trial results in category i, i = 1,..., k. The parameter space is

Multinomial Data. f(y θ) θ y i. where θ i is the probability that a given trial results in category i, i = 1,..., k. The parameter space is Multinomial Data The multinomial distribution is a generalization of the binomial for the situation in which each trial results in one and only one of several categories, as opposed to just two, as in

More information

simple if it completely specifies the density of x

simple if it completely specifies the density of x 3. Hypothesis Testing Pure significance tests Data x = (x 1,..., x n ) from f(x, θ) Hypothesis H 0 : restricts f(x, θ) Are the data consistent with H 0? H 0 is called the null hypothesis simple if it completely

More information

Confidence Distribution

Confidence Distribution Confidence Distribution Xie and Singh (2013): Confidence distribution, the frequentist distribution estimator of a parameter: A Review Céline Cunen, 15/09/2014 Outline of Article Introduction The concept

More information

1. Fisher Information

1. Fisher Information 1. Fisher Information Let f(x θ) be a density function with the property that log f(x θ) is differentiable in θ throughout the open p-dimensional parameter set Θ R p ; then the score statistic (or score

More information

Carl N. Morris. University of Texas

Carl N. Morris. University of Texas EMPIRICAL BAYES: A FREQUENCY-BAYES COMPROMISE Carl N. Morris University of Texas Empirical Bayes research has expanded significantly since the ground-breaking paper (1956) of Herbert Robbins, and its province

More information

FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE

FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE FREQUENTIST BEHAVIOR OF FORMAL BAYESIAN INFERENCE Donald A. Pierce Oregon State Univ (Emeritus), RERF Hiroshima (Retired), Oregon Health Sciences Univ (Adjunct) Ruggero Bellio Univ of Udine For Perugia

More information

A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling

A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling A union of Bayesian, frequentist and fiducial inferences by confidence distribution and artificial data sampling Min-ge Xie Department of Statistics, Rutgers University Workshop on Higher-Order Asymptotics

More information

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X.

Optimization. The value x is called a maximizer of f and is written argmax X f. g(λx + (1 λ)y) < λg(x) + (1 λ)g(y) 0 < λ < 1; x, y X. Optimization Background: Problem: given a function f(x) defined on X, find x such that f(x ) f(x) for all x X. The value x is called a maximizer of f and is written argmax X f. In general, argmax X f may

More information

arxiv: v1 [math.st] 7 Jan 2014

arxiv: v1 [math.st] 7 Jan 2014 Three Occurrences of the Hyperbolic-Secant Distribution Peng Ding Department of Statistics, Harvard University, One Oxford Street, Cambridge 02138 MA Email: pengding@fas.harvard.edu arxiv:1401.1267v1 [math.st]

More information

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008

Gaussian processes. Chuong B. Do (updated by Honglak Lee) November 22, 2008 Gaussian processes Chuong B Do (updated by Honglak Lee) November 22, 2008 Many of the classical machine learning algorithms that we talked about during the first half of this course fit the following pattern:

More information

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3

Hypothesis Testing. 1 Definitions of test statistics. CB: chapter 8; section 10.3 Hypothesis Testing CB: chapter 8; section 0.3 Hypothesis: statement about an unknown population parameter Examples: The average age of males in Sweden is 7. (statement about population mean) The lowest

More information

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition

COPYRIGHTED MATERIAL CONTENTS. Preface Preface to the First Edition Preface Preface to the First Edition xi xiii 1 Basic Probability Theory 1 1.1 Introduction 1 1.2 Sample Spaces and Events 3 1.3 The Axioms of Probability 7 1.4 Finite Sample Spaces and Combinatorics 15

More information

Brief Review on Estimation Theory

Brief Review on Estimation Theory Brief Review on Estimation Theory K. Abed-Meraim ENST PARIS, Signal and Image Processing Dept. abed@tsi.enst.fr This presentation is essentially based on the course BASTA by E. Moulines Brief review on

More information

13: Variational inference II

13: Variational inference II 10-708: Probabilistic Graphical Models, Spring 2015 13: Variational inference II Lecturer: Eric P. Xing Scribes: Ronghuo Zheng, Zhiting Hu, Yuntian Deng 1 Introduction We started to talk about variational

More information

COS513 LECTURE 8 STATISTICAL CONCEPTS

COS513 LECTURE 8 STATISTICAL CONCEPTS COS513 LECTURE 8 STATISTICAL CONCEPTS NIKOLAI SLAVOV AND ANKUR PARIKH 1. MAKING MEANINGFUL STATEMENTS FROM JOINT PROBABILITY DISTRIBUTIONS. A graphical model (GM) represents a family of probability distributions

More information

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples

Bayesian inference for sample surveys. Roderick Little Module 2: Bayesian models for simple random samples Bayesian inference for sample surveys Roderick Little Module : Bayesian models for simple random samples Superpopulation Modeling: Estimating parameters Various principles: least squares, method of moments,

More information

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner

Fundamentals. CS 281A: Statistical Learning Theory. Yangqing Jia. August, Based on tutorial slides by Lester Mackey and Ariel Kleiner Fundamentals CS 281A: Statistical Learning Theory Yangqing Jia Based on tutorial slides by Lester Mackey and Ariel Kleiner August, 2011 Outline 1 Probability 2 Statistics 3 Linear Algebra 4 Optimization

More information

Objective Bayesian Hypothesis Testing

Objective Bayesian Hypothesis Testing Objective Bayesian Hypothesis Testing José M. Bernardo Universitat de València, Spain jose.m.bernardo@uv.es Statistical Science and Philosophy of Science London School of Economics (UK), June 21st, 2010

More information

Notes on the Multivariate Normal and Related Topics

Notes on the Multivariate Normal and Related Topics Version: July 10, 2013 Notes on the Multivariate Normal and Related Topics Let me refresh your memory about the distinctions between population and sample; parameters and statistics; population distributions

More information

The exponential family: Conjugate priors

The exponential family: Conjugate priors Chapter 9 The exponential family: Conjugate priors Within the Bayesian framework the parameter θ is treated as a random quantity. This requires us to specify a prior distribution p(θ), from which we can

More information

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) Markov Chain Monte Carlo (MCMC Dependent Sampling Suppose we wish to sample from a density π, and we can evaluate π as a function but have no means to directly generate a sample. Rejection sampling can

More information

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory

Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory Statistical Inference Parametric Inference Maximum Likelihood Inference Exponential Families Expectation Maximization (EM) Bayesian Inference Statistical Decison Theory IP, José Bioucas Dias, IST, 2007

More information

1 Introduction. P (n = 1 red ball drawn) =

1 Introduction. P (n = 1 red ball drawn) = Introduction Exercises and outline solutions. Y has a pack of 4 cards (Ace and Queen of clubs, Ace and Queen of Hearts) from which he deals a random of selection 2 to player X. What is the probability

More information

Part 2: One-parameter models

Part 2: One-parameter models Part 2: One-parameter models 1 Bernoulli/binomial models Return to iid Y 1,...,Y n Bin(1, ). The sampling model/likelihood is p(y 1,...,y n ) = P y i (1 ) n P y i When combined with a prior p( ), Bayes

More information

Curve Fitting Re-visited, Bishop1.2.5

Curve Fitting Re-visited, Bishop1.2.5 Curve Fitting Re-visited, Bishop1.2.5 Maximum Likelihood Bishop 1.2.5 Model Likelihood differentiation p(t x, w, β) = Maximum Likelihood N N ( t n y(x n, w), β 1). (1.61) n=1 As we did in the case of the

More information

A BAYESIAN MATHEMATICAL STATISTICS PRIMER. José M. Bernardo Universitat de València, Spain

A BAYESIAN MATHEMATICAL STATISTICS PRIMER. José M. Bernardo Universitat de València, Spain A BAYESIAN MATHEMATICAL STATISTICS PRIMER José M. Bernardo Universitat de València, Spain jose.m.bernardo@uv.es Bayesian Statistics is typically taught, if at all, after a prior exposure to frequentist

More information

MATH c UNIVERSITY OF LEEDS Examination for the Module MATH2715 (January 2015) STATISTICAL METHODS. Time allowed: 2 hours

MATH c UNIVERSITY OF LEEDS Examination for the Module MATH2715 (January 2015) STATISTICAL METHODS. Time allowed: 2 hours MATH2750 This question paper consists of 8 printed pages, each of which is identified by the reference MATH275. All calculators must carry an approval sticker issued by the School of Mathematics. c UNIVERSITY

More information

THE MINIMAL BELIEF PRINCIPLE: A NEW METHOD FOR PARAMETRIC INFERENCE

THE MINIMAL BELIEF PRINCIPLE: A NEW METHOD FOR PARAMETRIC INFERENCE 1 THE MINIMAL BELIEF PRINCIPLE: A NEW METHOD FOR PARAMETRIC INFERENCE Chuanhai Liu and Jianchun Zhang Purdue University Abstract: Contemporary very-high-dimensional (VHD) statistical problems call attention

More information

Lecture 2. (See Exercise 7.22, 7.23, 7.24 in Casella & Berger)

Lecture 2. (See Exercise 7.22, 7.23, 7.24 in Casella & Berger) 8 HENRIK HULT Lecture 2 3. Some common distributions in classical and Bayesian statistics 3.1. Conjugate prior distributions. In the Bayesian setting it is important to compute posterior distributions.

More information

Spring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n =

Spring 2012 Math 541A Exam 1. X i, S 2 = 1 n. n 1. X i I(X i < c), T n = Spring 2012 Math 541A Exam 1 1. (a) Let Z i be independent N(0, 1), i = 1, 2,, n. Are Z = 1 n n Z i and S 2 Z = 1 n 1 n (Z i Z) 2 independent? Prove your claim. (b) Let X 1, X 2,, X n be independent identically

More information

Australian & New Zealand Journal of Statistics

Australian & New Zealand Journal of Statistics Australian & New Zealand Journal of Statistics Aust.N.Z.J.Stat.51(2), 2009, 115 126 doi: 10.1111/j.1467-842X.2009.00548.x ROUTES TO HIGHER-ORDER ACCURACY IN PARAMETRIC INFERENCE G. ALASTAIR YOUNG 1 Imperial

More information

Stat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet.

Stat 535 C - Statistical Computing & Monte Carlo Methods. Arnaud Doucet. Stat 535 C - Statistical Computing & Monte Carlo Methods Arnaud Doucet Email: arnaud@cs.ubc.ca 1 Suggested Projects: www.cs.ubc.ca/~arnaud/projects.html First assignement on the web: capture/recapture.

More information

Bayesian Inference. Chapter 9. Linear models and regression

Bayesian Inference. Chapter 9. Linear models and regression Bayesian Inference Chapter 9. Linear models and regression M. Concepcion Ausin Universidad Carlos III de Madrid Master in Business Administration and Quantitative Methods Master in Mathematical Engineering

More information

Confidence distributions in statistical inference

Confidence distributions in statistical inference Confidence distributions in statistical inference Sergei I. Bityukov Institute for High Energy Physics, Protvino, Russia Nikolai V. Krasnikov Institute for Nuclear Research RAS, Moscow, Russia Motivation

More information

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata

Testing Hypothesis. Maura Mezzetti. Department of Economics and Finance Università Tor Vergata Maura Department of Economics and Finance Università Tor Vergata Hypothesis Testing Outline It is a mistake to confound strangeness with mystery Sherlock Holmes A Study in Scarlet Outline 1 The Power Function

More information

Advanced topics from statistics

Advanced topics from statistics Advanced topics from statistics Anders Ringgaard Kristensen Advanced Herd Management Slide 1 Outline Covariance and correlation Random vectors and multivariate distributions The multinomial distribution

More information

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS

PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS PATTERN RECOGNITION AND MACHINE LEARNING CHAPTER 2: PROBABILITY DISTRIBUTIONS Parametric Distributions Basic building blocks: Need to determine given Representation: or? Recall Curve Fitting Binary Variables

More information

Statistical Theory MT 2007 Problems 4: Solution sketches

Statistical Theory MT 2007 Problems 4: Solution sketches Statistical Theory MT 007 Problems 4: Solution sketches 1. Consider a 1-parameter exponential family model with density f(x θ) = f(x)g(θ)exp{cφ(θ)h(x)}, x X. Suppose that the prior distribution has the

More information

Bayesian estimation of the discrepancy with misspecified parametric models

Bayesian estimation of the discrepancy with misspecified parametric models Bayesian estimation of the discrepancy with misspecified parametric models Pierpaolo De Blasi University of Torino & Collegio Carlo Alberto Bayesian Nonparametrics workshop ICERM, 17-21 September 2012

More information

Review. DS GA 1002 Statistical and Mathematical Models. Carlos Fernandez-Granda

Review. DS GA 1002 Statistical and Mathematical Models.   Carlos Fernandez-Granda Review DS GA 1002 Statistical and Mathematical Models http://www.cims.nyu.edu/~cfgranda/pages/dsga1002_fall16 Carlos Fernandez-Granda Probability and statistics Probability: Framework for dealing with

More information

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach

Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Statistical Methods for Handling Incomplete Data Chapter 2: Likelihood-based approach Jae-Kwang Kim Department of Statistics, Iowa State University Outline 1 Introduction 2 Observed likelihood 3 Mean Score

More information

Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands

Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands Asymptotic Multivariate Kriging Using Estimated Parameters with Bayesian Prediction Methods for Non-linear Predictands Elizabeth C. Mannshardt-Shamseldin Advisor: Richard L. Smith Duke University Department

More information

Principles of Statistics

Principles of Statistics Part II Year 2018 2017 2016 2015 2014 2013 2012 2011 2010 2009 2008 2007 2006 2005 2018 81 Paper 4, Section II 28K Let g : R R be an unknown function, twice continuously differentiable with g (x) M for

More information

Figure 36: Respiratory infection versus time for the first 49 children.

Figure 36: Respiratory infection versus time for the first 49 children. y BINARY DATA MODELS We devote an entire chapter to binary data since such data are challenging, both in terms of modeling the dependence, and parameter interpretation. We again consider mixed effects

More information

Testing Statistical Hypotheses

Testing Statistical Hypotheses E.L. Lehmann Joseph P. Romano, 02LEu1 ttd ~Lt~S Testing Statistical Hypotheses Third Edition With 6 Illustrations ~Springer 2 The Probability Background 28 2.1 Probability and Measure 28 2.2 Integration.........

More information

Foundations of Statistical Inference

Foundations of Statistical Inference Foundations of Statistical Inference Jonathan Marchini Department of Statistics University of Oxford MT 2013 Jonathan Marchini (University of Oxford) BS2a MT 2013 1 / 27 Course arrangements Lectures M.2

More information

Chapter 3 : Likelihood function and inference

Chapter 3 : Likelihood function and inference Chapter 3 : Likelihood function and inference 4 Likelihood function and inference The likelihood Information and curvature Sufficiency and ancilarity Maximum likelihood estimation Non-regular models EM

More information

Harrison B. Prosper. CMS Statistics Committee

Harrison B. Prosper. CMS Statistics Committee Harrison B. Prosper Florida State University CMS Statistics Committee 08-08-08 Bayesian Methods: Theory & Practice. Harrison B. Prosper 1 h Lecture 3 Applications h Hypothesis Testing Recap h A Single

More information

1 Data Arrays and Decompositions

1 Data Arrays and Decompositions 1 Data Arrays and Decompositions 1.1 Variance Matrices and Eigenstructure Consider a p p positive definite and symmetric matrix V - a model parameter or a sample variance matrix. The eigenstructure is

More information

Improper mixtures and Bayes s theorem

Improper mixtures and Bayes s theorem and Bayes s theorem and Han Han Department of Statistics University of Chicago DASF-III conference Toronto, March 2010 Outline Bayes s theorem 1 Bayes s theorem 2 Bayes s theorem Non-Bayesian model: Domain

More information

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent

Latent Variable Models for Binary Data. Suppose that for a given vector of explanatory variables x, the latent Latent Variable Models for Binary Data Suppose that for a given vector of explanatory variables x, the latent variable, U, has a continuous cumulative distribution function F (u; x) and that the binary

More information