Approximations of marginal tail probabilities for a class of smooth functions with applications to Bayesian and conditional inference


Biometrika (1991), 78, 4, pp. 891-902
Printed in Great Britain

BY THOMAS J. DiCICCIO AND MICHAEL A. MARTIN
Department of Statistics, Stanford University, Stanford, California 94305, U.S.A.

SUMMARY

This paper presents an asymptotic approximation of marginal tail probabilities for a real-valued function of a random vector, where the function has continuous gradient that does not vanish at the mode of the joint density of the random vector. This approximation has error O(n^{-3/2}) and improves upon a related standard normal approximation which has error O(n^{-1/2}). Derivation involves the application of a tail probability formula given by DiCiccio, Field & Fraser (1990) to an approximation of a marginal density derived by Tierney, Kass & Kadane (1989). The approximation can be applied for Bayesian and conditional inference as well as for approximating sampling distributions, and the accuracy of the approximation is illustrated through several numerical examples related to such applications. In the context of conditional inference, we develop refinements of the standard normal approximation to the distribution of two different signed root likelihood ratio statistics for a component of the natural parameter in exponential families.

Some key words: Asymptotic expansion; Conditional likelihood; Confidence limit; Exponential family; Exponential regression model; Marginal posterior distribution function; Natural parameter; Normal approximation; Signed root likelihood ratio statistic.

1. INTRODUCTION

Consider a continuous random vector X = (X^1, ..., X^p) having probability density function of the form

  f_X(x) = c b(x) exp{l(x)},  x = (x^1, ..., x^p).  (1)

Suppose that the function l attains its maximum value at x̂ = (x̂^1, ..., x̂^p) and that X − x̂ is O_p(n^{-1/2}) as some parameter n, usually sample size, increases indefinitely. For each fixed x, assume that l(x) and its partial derivatives are O(n) and that b(x) is O(1). Now consider a real-valued variable Y = g(X), where the function g has continuous gradient that is nonzero at x̂. In this paper, we present an accurate approximation for marginal tail probabilities of Y that is easy to compute and does not involve numerical integration in high dimensions.

To calculate an initial approximation of the marginal tail probability pr(Y ≤ y), let x̃ = x̃(y) be the value of x that maximizes l(x) subject to the constraint g(x) = y. Moreover, let ŷ = g(x̂), so that Y − ŷ is O_p(n^{-1/2}) and x̃(ŷ) = x̂. Consider the function

  r(y) = sgn(y − ŷ)(2[l(x̂) − l{x̃(y)}])^{1/2},  (2)

which is assumed to be monotonically increasing. Approximations to the distribution function of Y can be based on normal approximations to the distribution of R = r(Y). In particular, provided y − ŷ is O(n^{-1/2}),

  pr(Y ≤ y) = pr(R ≤ r) = Φ(r) + O(n^{-1/2}),  (3)

where r = r(y) and Φ is the standard normal distribution function.

The standard normal approximation to the distribution of R can be improved. Additional notation is necessary to formulate a more accurate approximation. Let l_i(x) = ∂l(x)/∂x^i, l_ij(x) = ∂²l(x)/∂x^i∂x^j, g_i(x) = ∂g(x)/∂x^i and g_ij(x) = ∂²g(x)/∂x^i∂x^j, etc. (i, j = 1, ..., p). Put

  J_ij(y) = −l_ij{x̃(y)} + [l_k{x̃(y)}/g_k{x̃(y)}] g_ij{x̃(y)},

where k is any index such that g_k{x̃(y)} does not vanish. Such an index k always exists by virtue of the assumptions about g. Define J(y) = {J_ij(y)} and J(y)^{-1} = {J^{ij}(y)}. Thus J(y) is a p×p matrix and J(ŷ) = {−l_ij(x̂)}. Finally, let

  Q(y) = J^{ij}(y) g_i{x̃(y)} g_j{x̃(y)},  D(y) = {Q(y)|J(y)|/|J(ŷ)|}^{-1/2}.

In this expression for Q(y) and in subsequent expressions, the summation convention is used. The improved approximation is

  pr(Y ≤ y) = Φ(r) + φ(r)[1/r + D(y) b{x̃(y)} g_j{x̃(y)}/(b(x̂) l_j{x̃(y)})] + O(n^{-3/2}),  (4)

where r = r(y), φ is the standard normal probability density function and j is any index such that g_j{x̃(y)} is nonzero. For the univariate case p = 1, when g is the identity function, approximation (4) reduces to

  pr(X ≤ x) = Φ(r) + φ(r)[1/r + {−l^{(2)}(x̂)}^{1/2} b(x)/{l^{(1)}(x) b(x̂)}] + O(n^{-3/2}),  (5)

where r = r(x) = sgn(x − x̂)[2{l(x̂) − l(x)}]^{1/2} and l^{(k)}(x) = d^k l(x)/dx^k (k = 1, 2).

Formula (4) is especially useful in Bayesian situations where it provides accurate approximations to marginal posterior distribution functions. For such applications, it is convenient to have l the log likelihood function and b the prior density. An example of this type is considered in §3.

In §2, we present a derivation of (4); we apply a tail probability approximation given by DiCiccio et al. (1990) to the approximation of a marginal density developed by Tierney et al. (1989). Section 3 contains several numerical examples which illustrate the accuracy of approximation (4) in a variety of situations. Applications of (4) to exponential families are discussed in §4. In particular, approximations to marginal tail probabilities for scalar functions of the sufficient statistic are given. Approximate conditional inference for the natural parameters of the family is also examined.
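As a concrete illustration of (5) with b constant, the following sketch (Python with NumPy/SciPy; the gamma example and all names are ours, and the formula used is the reconstruction displayed above) compares the approximation with an exact gamma tail probability.

    import numpy as np
    from scipy import stats

    def tail_prob(x, l, l1, l2, xhat):
        # Approximation (5) with b constant:
        # Phi(r) + phi(r)[1/r + {-l''(xhat)}^{1/2}/l'(x)], assuming x != xhat
        r = np.sign(x - xhat) * np.sqrt(2.0 * (l(xhat) - l(x)))
        return stats.norm.cdf(r) + stats.norm.pdf(r) * (1.0 / r + np.sqrt(-l2(xhat)) / l1(x))

    # Toy case: f_X(x) proportional to exp{l(x)} with l(x) = n(a log x - x),
    # so X ~ Gamma(na + 1, rate n) exactly and the mode is xhat = a.
    n, a = 5, 2.0
    l  = lambda x: n * (a * np.log(x) - x)
    l1 = lambda x: n * (a / x - 1.0)
    l2 = lambda x: -n * a / x**2
    for x in (1.0, 1.5, 2.5, 3.5):
        print(x, stats.gamma.cdf(x, n * a + 1, scale=1.0 / n), tail_prob(x, l, l1, l2, a))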

2. DERIVATION OF TAIL PROBABILITY APPROXIMATIONS

DiCiccio et al. (1990) have considered tail probability approximations for (1) in the univariate case p = 1 with b(x) = 1. They showed that, provided x − x̂ is O(n^{-1/2}),

  pr(X ≤ x) = Φ(r) + φ(r)[1/r + {−l^{(2)}(x̂)}^{1/2}/l^{(1)}(x)] + O(n^{-3/2}),  (6)

where r is defined as for (5). This approximation applies even if the density of X is not completely known except for a normalizing constant. In particular, it is valid if f_X(x) = c exp{l(x)}{1 + O(n^{-3/2})} when x − x̂ is O(n^{-1/2}), where c is a normalizing constant such that c exp{l(x)} integrates to 1 + O(n^{-3/2}).

Approximation (4) can be derived by applying (6) to an approximation of an appropriate marginal density. Tierney et al. (1989, 1991) have given an asymptotic approximation to the marginal density of Y = g(X) for p ≥ 1. The renormalized version of their approximation to the true density f_Y(y) is

  f*_Y(y) = c D(y)[b{x̃(y)}/b(x̂)] exp[l{x̃(y)} − l(x̂)],  (7)

where c is a normalizing constant such that f*_Y(y) integrates to 1 + O(n^{-3/2}). Provided y − ŷ is O(n^{-1/2}), this renormalized approximation has relative error of order n^{-3/2}; that is, f_Y(y) = f*_Y(y){1 + O(n^{-3/2})}. Leonard, Hsu & Tsui (1989) also discuss the saddlepoint accuracy of (7).

Now consider a change of variable W = h(Y), where the function h(y) is chosen to satisfy dh(y)/dy = n^{-1/2} D(y) b{x̃(y)}. Then h(y) is a monotonically increasing function. The Tierney et al. approximation to the density of W is f*_W(w) ∝ exp{T(w)}, where T{h(y)} = l{x̃(y)}. Note that T(w) is maximized at ŵ = h(ŷ) and that T(ŵ) = l(x̂). Application of (6) to this approximate marginal density of W yields

  pr(Y ≤ y) = Φ(r) + φ(r)[1/r + {−T^{(2)}(ŵ)}^{1/2}/T^{(1)}(w)] + O(n^{-3/2}),  (8)

where w = h(y), r = sgn(w − ŵ)[2{T(ŵ) − T(w)}]^{1/2} and T^{(k)}(w) = d^k T(w)/dw^k (k = 1, 2). Explicit knowledge of the function h(y) is not required to calculate approximation (8). Since w = h(y) in (8), it follows that

  r = sgn{h(y) − h(ŷ)}(2[T{h(ŷ)} − T{h(y)}])^{1/2} = sgn(y − ŷ)(2[l(x̂) − l{x̃(y)}])^{1/2},  (9)

which coincides with (2). To find an expression for T^{(1)}(w) = T^{(1)}{h(y)} in (8), note that differentiation of T{h(y)} = l{x̃(y)} with respect to y yields

  T^{(1)}{h(y)} h^{(1)}(y) = l_i{x̃(y)} x̃^i_{(1)}(y),  (10)

where h^{(1)}(y) = dh(y)/dy and x̃^i_{(1)}(y) = dx̃^i(y)/dy (i = 1, ..., p). A simple formula for the right-hand side of (10) is available by a Lagrange multiplier argument for maximizing l(x) subject to the constraint g(x) = y. By using such an argument, it may be shown that

  g_i{x̃(y)} x̃^i_{(1)}(y) = 1,  l_i{x̃(y)} = [l_j{x̃(y)}/g_j{x̃(y)}] g_i{x̃(y)}  (i = 1, ..., p),  (11)

for any index j having g_j{x̃(y)} ≠ 0. At least one such index always exists by assumption. Hence

  T^{(1)}{h(y)} = n^{1/2} l_j{x̃(y)}/[g_j{x̃(y)} D(y) b{x̃(y)}],  (12)

where j is any index for which g_j{x̃(y)} is nonzero. To find an expression for −T^{(2)}(ŵ) in (8), note that differentiation of (10) with respect to y, evaluated at y = ŷ, yields

  T^{(2)}(ŵ){h^{(1)}(ŷ)}² = −J_ij(ŷ) x̃^i_{(1)}(ŷ) x̃^j_{(1)}(ŷ).  (13)

It follows from differentiation of (11) with respect to y that

  x̃^i_{(1)}(ŷ) = J^{ij}(ŷ) g_j(x̂)/Q(ŷ)  (i = 1, ..., p).  (14)

Substitution of (14) into (13) produces

  {−T^{(2)}(ŵ)}^{1/2} = n^{1/2}{b(x̂)}^{-1}.  (15)

Finally, by substitution of (9), (12) and (15) into (8), we obtain approximation (4).

A desirable feature of (4) is its equivariance under invertible transformations of Y. For example, if Z = γ(X) is related to Y = g(X) by Y = ζ(Z), where ζ is a real-valued, differentiable and increasing transformation, then the approximation to pr(Z ≤ z) obtained by applying (4) to Z = γ(X) directly coincides with the approximation to pr{Y ≤ ζ(z)} obtained by applying (4) to Y. Similarly, if ζ is decreasing, then the approximation to pr(Z ≤ z) coincides with that to 1 − pr{Y ≤ ζ(z)}. Note, however, that (4) is not invariant under nonlinear transformations of the joint density (1). We discuss this issue further in §3.

In the case where b(x) = 1 and g is a coordinate function, say g(x) = x^1, approximation (4) reduces to

  pr(X^1 ≤ x^1) = Φ(r) + φ(r)[1/r + D(x^1)/l_1{x̃(x^1)}] + O(n^{-3/2}),

where r = r(x^1), D(x^1) = [J^{11}(x^1)|J(x^1)|/|J(x̂^1)|]^{-1/2} and the components of J(x^1) have the particularly simple form J_ij(x^1) = −l_ij{x̃(x^1)}. This formula was derived by DiCiccio et al. (1990).

The conditions imposed on g place moderate limitations on the types of statistics for which tail probabilities may be approximated using (4). Leonard et al. (1989) present examples in which the Tierney et al. (1989) approximation for marginal densities produces inaccurate results. Approximation (4) also performs poorly for these instances. The examples of Leonard et al. focus on situations where the function g is many-to-one. For instance, if Y = g(X) = (X^1)² + ... + (X^p)² and x̂ = (0, ..., 0), then approximation (4) cannot be applied. On the other hand, if x̂ is close to zero, although approximation (4) is formally applicable, it can be expected to yield poor results in small to moderately sized samples.

An alternative approximation to (4) can be derived by applying (6) directly to the density (7) written as f*_Y(y) ∝ exp{l*(y)}, where l*(y) = l{x̃(y)} + log D(y) + log b{x̃(y)}. In general, a closed-form expression for x̃(y) is unavailable, and hence D(y) cannot be written explicitly. Since l*(y) depends on D(y), the maximizing point and derivatives of l*(y) required for application of (6) can be difficult to calculate. Numerical methods are available, however, that facilitate this application of (6).

One drawback of (4) is that it can yield approximations which exceed one or are negative. Such problems can be avoided by using an alternative approximation. It is easily shown by Taylor expansion of the right-hand side of (4) that

  pr(Y ≤ y) = Φ(r − r^{-1} log c) + O(n^{-3/2}),

where c = c(y) = −r D(y) b{x̃(y)} g_j{x̃(y)}/[b(x̂) l_j{x̃(y)}]. This alternative approximation was suggested to us by Luke Tierney and a referee.
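For later reference, here is a minimal computational sketch of (4) itself: the constrained maximization and all derivatives are handled numerically, so only l, g and the mode need be supplied. The routine below uses our own names, finite-difference derivatives and b ≡ 1 by default; it is a sketch under the reconstruction of (4) given in §1, not a polished implementation, and it assumes y ≠ g(x̂).

    import numpy as np
    from scipy import optimize, stats

    def num_grad(f, x, h=1e-6):
        E = np.eye(len(x))
        return np.array([(f(x + h*e) - f(x - h*e)) / (2.0*h) for e in E])

    def num_hess(f, x, h=1e-4):
        E = np.eye(len(x))
        return np.array([[(f(x + h*ei + h*ej) - f(x + h*ei - h*ej)
                           - f(x - h*ei + h*ej) + f(x - h*ei - h*ej)) / (4.0*h*h)
                          for ej in E] for ei in E])

    def tail_prob4(y, l, g, xhat, b=lambda x: 1.0):
        """Approximation (4) for pr(Y <= y), Y = g(X); assumes y != g(xhat)."""
        xhat = np.asarray(xhat, float)
        con = {'type': 'eq', 'fun': lambda x: g(x) - y}
        xt = optimize.minimize(lambda x: -l(x), xhat, constraints=[con]).x  # x-tilde(y)
        gl, gg = num_grad(l, xt), num_grad(g, xt)
        k = np.argmax(np.abs(gg))                       # any index with g_k nonzero
        J = -num_hess(l, xt) + (gl[k] / gg[k]) * num_hess(g, xt)
        Jhat = -num_hess(l, xhat)
        Q = gg @ np.linalg.solve(J, gg)
        D = (Q * np.linalg.det(J) / np.linalg.det(Jhat)) ** -0.5
        r = np.sign(y - g(xhat)) * np.sqrt(2.0 * (l(xhat) - l(xt)))
        corr = D * b(xt) * gg[k] / (b(xhat) * gl[k])
        return stats.norm.cdf(r) + stats.norm.pdf(r) * (1.0 / r + corr)

The routine is reused in the examples of §3 below.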

Approximation (4) may be interpreted both algebraically and numerically. In certain circumstances, the approximation produces a convenient, closed-form expression estimating pr(Y ≤ y); see §3.2. However, in many cases the approximation does not result in a closed-form expression, and it is then most effectively viewed as a useful computational tool.

3. APPLICATIONS

3.1. Exponential regression model

Feigl & Zelen (1965) investigated the relationship between survival time for leukaemia patients and a concomitant variable, patient white blood cell count. The sample they used consisted of 17 patients with acute myelogenous leukaemia. We study an exponential regression model for survival time T, which has density function conditional on x, the base 10 logarithm of white blood cell count,

  f(t | x) = θ_x^{-1} exp(−t/θ_x)  (t > 0),

where θ_x = exp(β₀ + β₁x). Inference about θ_x for a specified value of x is important. The density function of Y = log T, conditional on x, is

  exp{y − β₀ − β₁x − exp(y − β₀ − β₁x)}  (−∞ < y < ∞);

alternatively, we may write Y = β₀ + β₁x + ε, where ε has an extreme value distribution with density exp(z − e^z) for −∞ < z < ∞ (Lawless, 1982).

Let β̂ = (β̂₀, β̂₁) be the maximum likelihood estimator of β = (β₀, β₁). When censoring is absent, the residuals A_i = Y_i − β̂₀ − β̂₁x_i (i = 1, ..., n) are ancillary statistics. Let A = (A₁, ..., A_n) and (Z₀, Z₁) = β − β̂. Inference about β and about θ_x for some specified x, say x₀, can be based on the conditional density of Z = (Z₀, Z₁) given A. This conditional density is f_{Z|A}(z₀, z₁ | a) ∝ exp{l(z₀, z₁)}, where

  l(z₀, z₁) = Σ_{i=1}^n {a_i − z₀ − z₁x_i − exp(a_i − z₀ − z₁x_i)};  (16)

see Lawless (1982, p. 290), who develops exact conditional procedures based on f_{Z|A}(z₀, z₁ | a). We focus on inference for θ_{x₀} by considering the pivotal quantity Z₂ = Z₀ + Z₁x₀ = log θ_{x₀} − log θ̂_{x₀}. Lawless derives an exact formula for pr(Z₂ ≤ y | A = a). Unfortunately, Lawless's technique does not extend easily beyond the case of a single regressor variable, as it requires numerical integration of the density of Z given A.

As an alternative to the exact conditional procedure, we could use the large-sample normal approximation β̂ ~ N(β, I^{-1}), where I is the observed information matrix. Then Z₂ has an approximate normal distribution on which tests and confidence intervals for θ_{x₀} can be based.

Table 1 contains exact and approximate values of pr(Z₂ ≤ y | A = a) for various values of y in the case when x₀ = log₁₀ 50 000. Exact tail probabilities were computed using equation (6.3.14) of Lawless (1982, p. 292) by numerical integration. For approximations (3) and (4) we chose b in equation (1) to be 1 and l to be given by (16).

[Table 1, omitted here: exact tail probabilities of Z₂, compared with approximations (3) and (4) and the large-sample normal approximation, for several values of y. Entries are percentages; starred entries denote tail probabilities taken to the right.]
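A sketch of how such entries can be produced with the tail_prob4 routine of §2 follows. The data below are simulated stand-ins for the Feigl & Zelen sample, since the actual observations are not reproduced here, so the printed values are illustrative only; all names are ours.

    import numpy as np
    from scipy import optimize

    rng = np.random.default_rng(1)
    n = 17
    x = rng.uniform(2.5, 5.0, n)             # stand-in log10 white blood cell counts
    yv = 3.0 - 0.3 * x + np.log(rng.exponential(size=n))   # extreme-value errors

    def negll(bt):                           # unconditional log likelihood
        e = yv - bt[0] - bt[1] * x
        return -np.sum(e - np.exp(e))

    bhat = optimize.minimize(negll, np.zeros(2)).x
    a = yv - bhat[0] - bhat[1] * x           # ancillary residuals

    x0 = 4.0                                 # covariate value of interest
    def l(z):                                # equation (16), z = beta - betahat
        e = a - z[0] - z[1] * x
        return np.sum(e - np.exp(e))
    g = lambda z: z[0] + z[1] * x0           # pivot Z2 = Z0 + Z1*x0; l is maximised at z = 0

    for y0 in (-0.8, -0.4, 0.4, 0.8):
        print(y0, tail_prob4(y0, l, g, np.zeros(2)))     # pr(Z2 <= y0 | A = a)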

For all values of y considered, approximation (4) gives results very close to the exact tail probabilities. Approximation (3) and the large-sample normal approximation give relatively inaccurate estimates.

We now compare 95% confidence intervals for θ_{x₀} when x₀ = log₁₀ 50 000, obtained by the methods discussed above. Upper and lower 2.5% percentage points for the distribution of Z₂ were obtained for each of the techniques, and the resulting 95% confidence intervals for log θ_{x₀} and θ_{x₀} are presented in Table 2. The intervals corresponding to approximations (3) and (4) were computed by numerical inversion of those formulae. The intervals obtained using approximation (4) are very close to the exact intervals, while those obtained using (3) are less accurate but still reasonable. The intervals derived from the large-sample normal approximation are quite inaccurate in comparison with the other intervals, which suggests that larger samples might be needed to obtain high accuracy with this method.

Table 2. 95% confidence intervals for mean survival time of patients with white blood cell count of 50 000

                       95% c.i. for log θ_{x₀}    95% c.i. for θ_{x₀}
  Exact                (2.6879, 4.0799)           (14.70, 59.14)
  Approximation (3)    (2.6514, 4.0295)           (14.17, 56.23)
  Approximation (4)    (2.6883, 4.0795)           (14.71, 59.12)
  Large sample         (2.5786, 3.9427)           (13.18, 51.55)

Tierney et al. (1989) consider Bayesian inference for this model based on an improper uniform prior density on θ₁ = log β₀ and θ₂ = β₁. Approximation (4) could be applied to produce approximate marginal posterior tail probabilities in this Bayesian context by choosing b, the prior density, equal to 1 and l to be the log likelihood. Approximate posterior quantiles for linear functions of β obtained in this way coincide with the approximate conditional confidence limits for those parameters obtained using (4) with our previous choice of b and l. This correspondence is natural because of the connection between Bayesian and conditional inference for location models under the assumption of uniform priors. Tierney et al. consider in particular construction of an approximate marginal posterior density for the two year survival probability of patients at a white blood cell count of 50 000. This probability is an increasing function of θ_{x₀}. Consequently, use of (4) to derive approximate posterior quantiles or confidence limits for this probability produces the same results as transforming in the natural way the approximate quantiles or limits derived for θ_{x₀}.
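The numerical inversion used for the intervals in Table 2 can be sketched as follows; limit, lo and hi are our names, brentq simply root-finds the chosen tail approximation, and the bracket must avoid y = ŷ, where r = 0.

    from scipy import optimize

    def limit(prob, approx_cdf, lo, hi):
        # Find y with approx_cdf(y) = prob; [lo, hi] must bracket the root
        # and stay away from the singular point y = g(xhat).
        return optimize.brentq(lambda y: approx_cdf(y) - prob, lo, hi)

    # e.g. for Z2 in the model above (here g(xhat) = 0); since Z2 = log theta_x0 - log thetahat_x0,
    # the interval for log theta_x0 is log thetahat_x0 plus the Z2 quantiles:
    # z_lo = limit(0.025, lambda y0: tail_prob4(y0, l, g, np.zeros(2)), -2.0, -1e-2)
    # z_hi = limit(0.975, lambda y0: tail_prob4(y0, l, g, np.zeros(2)), 1e-2, 2.0)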

3.2. Noncentral t distribution

Let X₁, ..., X_n be independent and identically distributed observations from a normal N(μ, σ²) population, and let X̄ = n^{-1} Σ X_i and S² = (n − 1)^{-1} Σ (X_i − X̄)² denote the sample mean and sample variance respectively. Given a value x₀, the quantity T′ = n^{1/2}(x₀ − X̄)/S has a noncentral t distribution with n − 1 degrees of freedom and noncentrality parameter n^{1/2}(x₀ − μ)/σ. Computation of tail probabilities for the noncentral t distribution is difficult since it requires numerical integration of the noncentral t density, which is typically written in integral or infinite series form. In contrast, computation of approximation (4) to tail probabilities is relatively easy.

Without loss of generality, suppose that the normal population has zero mean and unit variance. For each of the four choices of variables (U, V) = (X̄, S), (X̄, log S), (X̄, S^{-1}) and (X̄, S²), we compute approximation (4) taking b in equation (1) to be a constant and l to be the logarithm of the joint density of U and V. Put Y = (x₀ − X̄)/S = n^{-1/2} T′. We compute tail probability approximations for Y based on equation (4) for each of the four choices of variables.

For the variables (X̄, S), we have for n > 3

  pr(Y ≤ y) ≈ Φ(r) + φ(r)[1/r + {2n(n−1)}^{1/2} v/(nu{2(n−2) + nyx₀v}^{1/2})],  (17)

where

  r = sgn(y − ŷ)[nx₀u + (n−2) log{(n−2)/((n−1)v²)}]^{1/2},  ŷ = x₀{(n−1)/(n−2)}^{1/2},
  u = x₀ − yv,  v = [nyx₀ + {(nyx₀)² + 4(n−2)(n−1+ny²)}^{1/2}]/{2(n−1+ny²)}.

For the variables (X̄, log S) the approximation is, for n > 1,

  pr(Y ≤ y) ≈ Φ(r) + φ(r)[1/r + (nue^v)^{-1}{2n(n−1)}^{1/2}{2(n−1) + ny² − nyue^{-v}}^{-1/2}],  (18)

where

  r = sgn(y − x₀){nx₀u − 2(n−1)v}^{1/2},  u = x₀ − ye^v,
  v = log([nyx₀ + {(nyx₀)² + 4(n−1)(n−1+ny²)}^{1/2}]/{2(n−1+ny²)}).

For the variables (X̄, S^{-1}), we have for n ≥ 2

  pr(Y ≤ y) ≈ Φ(r) + φ(r)[1/r + v{2n/(n−1)}^{1/2}/{u(2n + nyx₀/v)^{1/2}}],  (19)

where

  r = sgn(y − ŷ)[nx₀u + n log{nv²/(n−1)}]^{1/2},  ŷ = x₀{(n−1)/n}^{1/2},
  u = x₀ − y/v,  v = [−nyx₀ + {(nyx₀)² + 4n(n−1+ny²)}^{1/2}]/(2n).

Finally, for the variables (X̄, S²) we have the approximation, for n ≥ 4,

  pr(Y ≤ y) ≈ Φ(r) + φ(r)[1/r + (n−1)v{2n(n−3)}^{1/2}/{(n−3)nu(2(n−3) + nyx₀v^{1/2})^{1/2}}],  (20)

where

  r = sgn(y − ŷ)[nx₀u + (n−3) log{(n−3)/((n−1)v)}]^{1/2},  ŷ = x₀{(n−1)/(n−3)}^{1/2},
  u = x₀ − yv^{1/2},  v = ([nyx₀ + {(nyx₀)² + 4(n−3)(n−1+ny²)}^{1/2}]/{2(n−1+ny²)})².

The results of a numerical study of tail probabilities for Y with n = 5, x₀ = 0, corresponding to a central t distribution, and n = 5 with x₀ > 0, corresponding to a noncentral t distribution, are given in Table 3. Approximations (17)-(20) all appear to perform reasonably well; in each case substantial improvement is gained over approximation (3).

[Table 3, omitted here: tail probability approximations related to the noncentral t distribution; for each choice of variables, approximation (3) is compared with the corresponding approximation among (17)-(20) and with the exact value. Entries are percentages; values for y ≥ 6.5 are right-hand tail probabilities.]
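A sketch of (18) as reconstructed above (our names), checked against the exact noncentral t distribution available in scipy.stats; the noncentrality value used here is illustrative.

    import numpy as np
    from scipy import stats

    def tail_logS(y, n, x0):
        """Approximation (18): pr(Y <= y), Y = (x0 - Xbar)/S, via (Xbar, log S)."""
        A = n - 1 + n * y * y
        s = (n * y * x0 + np.sqrt((n * y * x0) ** 2 + 4 * (n - 1) * A)) / (2 * A)  # e^v
        v = np.log(s)
        u = x0 - y * s
        r = np.sign(y - x0) * np.sqrt(n * x0 * u - 2 * (n - 1) * v)
        corr = np.sqrt(2 * n * (n - 1)) / (
            n * u * s * np.sqrt(2 * (n - 1) + n * y * y - n * y * u / s))
        return stats.norm.cdf(r) + stats.norm.pdf(r) * (1.0 / r + corr)

    n, x0 = 5, 1.0
    for yy in (-0.5, 0.5, 1.5, 2.5):
        exact = stats.nct.cdf(np.sqrt(n) * yy, df=n - 1, nc=np.sqrt(n) * x0)
        print(yy, exact, tail_logS(yy, n, x0))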

The choice of variables seems important for the accuracy of the approximation. Approximations (17) and (18) involving variables (X̄, S) and (X̄, log S) are clearly best, while approximation (20) involving (X̄, S²) is the most inaccurate. An interesting point is that for x₀ = 0, approximation (17) is most accurate, while in the noncentral case considered, approximation (18) is preferable.

Tierney et al. (1989) consider the problem of estimating the proportion of a normal N(μ, σ²) population that falls below a point x₀. They study the estimator P̂ = Φ{(x₀ − X̄)/S} of this quantity. Since Φ(·) is a monotone function, approximation (4) to the tail probability pr(P̂ ≤ p) for each of the four choices of variables considered above can be obtained by replacing y by Φ^{-1}(p) in formulae (17)-(20).

4. APPLICATIONS TO EXPONENTIAL FAMILIES

4.1. Marginal tail probabilities for a function of the sufficient statistic

Suppose T₁, ..., T_n is a sample of size n from the exponential family having density

  f_T(t; τ) = exp{−τ_i t^i − β(τ) + γ(t)}.

Let β^i(τ) = ∂β(τ)/∂τ_i and β^{ij}(τ) = ∂²β(τ)/∂τ_i∂τ_j (i, j = 1, ..., p). Put B(τ) = {β^{ij}(τ)} and B(τ)^{-1} = {β_{ij}(τ)}, so that B(τ) is a p×p matrix and |B(τ)| is of order O(1). The density of the sufficient statistic X = n^{-1} Σ T_i satisfies

  f_X(x; τ) = c|B(τ̂)|^{-1/2} exp[n{β(τ̂) − β(τ) + x^i(τ̂_i − τ_i)}]{1 + O(n^{-3/2})},  (21)

provided x^i + β^i(τ) is O(n^{-1/2}), where τ̂ is given by x^i = −β^i(τ̂) (i = 1, ..., p) and c is a normalizing constant such that the approximation on the right-hand side of (21) integrates to 1 + O(n^{-3/2}). Here, τ̂ is the maximum likelihood estimator of τ based on the observation X = x. Formula (21) was derived by Barndorff-Nielsen & Cox (1979); see also Daniels (1958), Durbin (1980) and Reid (1988).
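Formula (21) can be checked directly in the simplest case. The sketch below (our example; it assumes the sign convention for the density stated above) takes the T_i exponential, for which X is exactly gamma distributed and the renormalized saddlepoint approximation is exact up to grid error.

    import numpy as np
    from scipy import stats
    from scipy.integrate import trapezoid

    # f(t; tau) = exp{-tau*t - beta(tau)} with beta(tau) = -log(tau), i.e. T ~ Exp(tau);
    # then X = mean of n draws is exactly Gamma(n, rate n*tau).
    def density21(x, n, tau):
        tauhat = 1.0 / x                       # solves x = -beta'(tauhat)
        B = 1.0 / tauhat ** 2                  # beta''(tauhat)
        return B ** -0.5 * np.exp(n * (-np.log(tauhat) + np.log(tau) + x * (tauhat - tau)))

    n, tau = 5, 2.0
    xs = np.linspace(0.05, 1.5, 400)
    f = density21(xs, n, tau)
    f /= trapezoid(f, xs)                      # renormalise, as in the text
    exact = stats.gamma.pdf(xs, n, scale=1.0 / (n * tau))
    print(np.max(np.abs(f - exact)))           # agrees closely on this grid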

Marginal tail probabilities for a real-valued function Y = g(X) can be approximated by applying (4) to (21). It is convenient to make the choice

  l(x) = n{β(τ̂) − β(τ) + x^i(τ̂_i − τ_i)},  b(x) = |B(τ̂)|^{-1/2}.

Note that l(x) attains its maximum value at the point x̂ = (x̂^1, ..., x̂^p) given by x̂^i = −β^i(τ). For fixed y, suppose x̃ = x̃(y) maximizes l(x) subject to the constraint g(x) = y, and let τ̃ = τ̃(y) satisfy x̃^i = −β^i(τ̃) (i = 1, ..., p). Observe that τ̃(y) = τ for y = g(x̂). Then approximation (4) to the marginal tail probability pr(Y ≤ y) is

  pr(Y ≤ y) = Φ(r) + φ(r)[1/r + D(y){|B(τ)|/|B(τ̃)|}^{1/2} g_k{x̃(y)}/{n(τ̃_k − τ_k)}] + O(n^{-3/2}),  (22)

where k is any index for which g_k{x̃(y)} does not vanish and

  r = sgn(y − ŷ)(2n[β(τ) − β(τ̃) + x̃^i(τ_i − τ̃_i)])^{1/2}.

For the case of the coordinate function g(x) = x^p, it is convenient to partition τ into (λ, ψ), where ψ = τ_p and λ = (λ₁, ..., λ_{p−1}) has λ_a = τ_a (a = 1, ..., p−1). Then τ̃(x^p) = (λ, ψ̃), with ψ̃ given by x^p = −β^p(λ, ψ̃), and (22) reduces to

  pr(X^p ≤ x^p) = Φ(r) + φ(r)(1/r + [n^{1/2}(ψ̃ − ψ){β^{pp}(λ, ψ̃)}^{1/2}]^{-1}) + O(n^{-3/2}),  (23)

where

  r = sgn(x^p − x̂^p)[2n{β(λ, ψ) − β(λ, ψ̃) + x^p(ψ − ψ̃)}]^{1/2}.
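For p = 1 the approximation (22)-(23) has a simple closed form. A sketch for the exponential-rate family used above (our names; β(τ) = −log τ under the stated sign convention), with the exact gamma tail for comparison:

    import numpy as np
    from scipy import stats

    def tail_exp_rate(x, n, tau):
        """p = 1 case of (22): pr(Xbar <= x) for T_i ~ Exp(tau)."""
        taut = 1.0 / x                             # tau-tilde solves x = -beta'(taut)
        r = np.sign(x - 1.0 / tau) * np.sqrt(
            2 * n * (np.log(taut / tau) + x * (tau - taut)))
        corr = taut / (np.sqrt(n) * (taut - tau))  # 1/[n^{1/2}(taut - tau) beta''(taut)^{1/2}]
        return stats.norm.cdf(r) + stats.norm.pdf(r) * (1.0 / r + corr)

    n, tau = 5, 2.0
    for x in (0.2, 0.35, 0.65, 0.9):
        print(x, stats.gamma.cdf(x, n, scale=1.0 / (n * tau)), tail_exp_rate(x, n, tau))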

4.2. Conditional inference for a component of the natural parameter

Now suppose τ = (λ, ψ) and ψ is the parameter of interest, with λ being a nuisance parameter. Unfortunately, the marginal distribution of X^p is not particularly useful for inference about ψ since that distribution depends on the nuisance parameter; indeed, approximation (23) involves λ. However, the conditional distribution of X^p given Y = (Y^1, ..., Y^{p−1}) = (X^1, ..., X^{p−1}) depends on τ only through ψ. Barndorff-Nielsen & Cox (1979) have shown that the conditional density of X^p given Y satisfies

  f_{X^p|Y}(x^p | y; ψ) = c{|B₁(λ*, ψ)|/|B(λ̂, ψ̂)|}^{1/2} exp[n{β(λ̂, ψ̂) − β(λ*, ψ) + y^a(λ̂_a − λ*_a) + x^p(ψ̂ − ψ)}]{1 + O(n^{-3/2})},  (24)

provided x^p − x̃^p is O(n^{-1/2}), where (λ̂, ψ̂) = τ̂, λ* satisfies y^a = −β^a(λ*, ψ) (a = 1, ..., p−1), B₁(λ, ψ) is the (p−1)×(p−1) submatrix of B(λ, ψ) corresponding to λ, and c is a normalizing constant such that the approximation on the right-hand side of (24) integrates to 1 + O(n^{-3/2}). Here λ* is the constrained maximum likelihood estimator of λ under the fixed value of ψ having observed X = x, and x̃^p is defined below.

Approximations to conditional tail probabilities for X^p can be obtained by applying (5) to (24). In this case it is natural to choose

  l(x^p) = n{β(λ̂, ψ̂) − β(λ*, ψ) + y^a(λ̂_a − λ*_a) + x^p(ψ̂ − ψ)},  b(x^p) = |B(λ̂, ψ̂)|^{-1/2}.

Since y^a = x^a (a = 1, ..., p−1) are taken as fixed and x^i = −β^i(λ̂, ψ̂) (i = 1, ..., p), it is possible to regard (λ̂, ψ̂) as a function of x^p alone. Straightforward calculations give l^{(1)}(x^p) = n(ψ̂ − ψ) and −l^{(2)}(x^p) = nβ_{pp}(λ̂, ψ̂), where {β_{ij}(λ, ψ)} = B(λ, ψ)^{-1}. Thus, l(x^p) is maximized at x̃^p = −β^p(λ*, ψ), and since B(λ, ψ) is positive definite for all (λ, ψ), it follows that x̃^p is a decreasing function of ψ. Formula (5) yields the approximation

  pr(X^p ≤ x^p | Y = y; ψ) = Φ(r) + φ(r)[1/r + |B₁(λ*, ψ)|^{1/2}/{n^{1/2}(ψ̂ − ψ)|B(λ̂, ψ̂)|^{1/2}}] + O(n^{-3/2}),  (25)

where

  r = sgn(x^p − x̃^p)[2n{β(λ*, ψ) − β(λ̂, ψ̂) − y^a(λ̂_a − λ*_a) − x^p(ψ̂ − ψ)}]^{1/2}.  (26)

Having observed X = x, an exact upper 1 − α conditional confidence limit for the parameter of interest can be computed as the value of ψ such that pr(X^p ≤ x^p | Y = y; ψ) = 1 − α. Similarly, an approximate conditional limit can be computed as the value of ψ for which the right-hand side of (25) equals 1 − α. This approximate limit differs from the exact limit by terms of order O_p(n^{-2}), and the corresponding conditional confidence interval has coverage error of order O(n^{-3/2}). In contrast, approximation (3) yields pr(X^p ≤ x^p | Y = y; ψ) ≈ Φ(r). The approximate confidence limit calculated as the value of ψ for which Φ(r) = 1 − α differs from the exact limit by terms of order O_p(n^{-1}), and the resulting interval has coverage error of order O(n^{-1}).

This approach to constructing approximate confidence limits is closely related to a method given by Barndorff-Nielsen (1986). Since sgn(x^p − x̃^p) = sgn(ψ − ψ̂), it follows that r defined at (26) is simply the signed root of the likelihood ratio statistic for ψ. Barndorff-Nielsen shows that the marginal distribution of r − r^{-1} log K is standard normal to error of order O(n^{-3/2}), where K is a variable admitting the expansion K = 1 + Q₁(λ*)r + Q₂(λ*)r² + O_p(n^{-3/2}), and Q₁ = Q₁(λ*) and Q₂ = Q₂(λ*) are O_p(n^{-1/2}) and O_p(n^{-1}), respectively. An approximate upper 1 − α confidence limit for the parameter of interest can be calculated as the value of ψ for which Φ(r − r^{-1} log K) = 1 − α. Note that, to error of order O_p(n^{-3/2}),

  Φ(r − r^{-1} log K) = Φ{r − Q₁ − (Q₂ − ½Q₁²)r}.

For the present problem, it follows from Barndorff-Nielsen's formula (3.11) that

  K = −[r/{n^{1/2}(ψ̂ − ψ)}]{|B₁(λ*, ψ)|/|B(λ̂, ψ̂)|}^{1/2}.

The approximate upper 1 − α limits obtained through Φ(r − r^{-1} log K) and (25) therefore differ by terms of order O_p(n^{-2}), and the corresponding conditional confidence intervals both have coverage error of order O(n^{-3/2}). The primary advantage in considering (25) is that it shows these limits to have conditional validity. Approximation (25) is also given by Skovgaard (1987) and is discussed by Davison (1988).
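A check of (25) and (26) as reconstructed: for a multivariate normal exponential family with β(τ) = ½τ′Στ under the sign convention of §4.1, the conditional distribution is normal and the approximation should reproduce it essentially exactly. All numerical values below are illustrative and the names are ours.

    import numpy as np
    from scipy import stats

    # f(t; tau) = exp{-tau't - beta(tau) + gamma(t)}, beta(tau) = 0.5 tau' Sigma tau,
    # so T ~ N(-Sigma tau, Sigma); here p = 2, psi = tau_2, lambda = tau_1.
    Sigma = np.array([[2.0, 0.8], [0.8, 1.0]])
    Sinv = np.linalg.inv(Sigma)
    beta = lambda tau: 0.5 * tau @ Sigma @ tau
    n, psi, lam = 10, 0.5, -0.3              # true tau = (lam, psi)
    y1, x2 = 0.4, -1.2                       # observed sufficient statistic (y, x^p)

    x = np.array([y1, x2])
    tauhat = -Sinv @ x                       # x^i = -beta^i(tauhat)
    lamstar = -(y1 + Sigma[0, 1] * psi) / Sigma[0, 0]   # y^1 = -beta^1(lam*, psi)
    taustar = np.array([lamstar, psi])
    x2tilde = -(Sigma @ taustar)[1]          # mode of the conditional density

    arg = 2 * n * (beta(taustar) - beta(tauhat)
                   - y1 * (tauhat[0] - lamstar) - x2 * (tauhat[1] - psi))
    r = np.sign(x2 - x2tilde) * np.sqrt(arg)                        # (26)
    corr = np.sqrt(Sinv[1, 1]) / (np.sqrt(n) * (tauhat[1] - psi))   # (25); B constant here
    approx = stats.norm.cdf(r) + stats.norm.pdf(r) * (1.0 / r + corr)

    # Exact conditional law: X^2 | X^1 = y1 is normal, by the usual formulas
    mu = -Sigma @ np.array([lam, psi])
    m = mu[1] + Sigma[0, 1] / Sigma[0, 0] * (y1 - mu[0])
    v = (Sigma[1, 1] - Sigma[0, 1] ** 2 / Sigma[0, 0]) / n
    print(approx, stats.norm.cdf(x2, m, np.sqrt(v)))

The two printed values agree up to rounding, since in the Gaussian case r equals the exact standardized deviate and the correction term cancels 1/r exactly.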

Using (24), Barndorff-Nielsen & Cox (1979, formula (6.1)) have approximated the conditional log likelihood function for ψ based on the observation X = x by

  l̄(ψ; x^p | y) = ½ log|B₁(λ*, ψ)| − n{β(λ*, ψ) + y^a λ*_a + x^p ψ}.  (27)

Let ψ̄ be the value of ψ for which (27) is maximized, and put λ̄ equal to the corresponding value of λ*. Since y^a = x^a (a = 1, ..., p−1) are taken as fixed, it is possible to regard (λ̄, ψ̄) as a function of x^p alone. The difference between the approximate conditional maximum likelihood estimator ψ̄ and the unconditional maximum likelihood estimator ψ̂ is of order O_p(n^{-1}); see Barndorff-Nielsen & Cox (1979, formula (6.2)). By a variation of the argument given by Barndorff-Nielsen & Cox that leads to their formulae (3.15) and (4.8), it follows that

  f_{X^p|Y}(x^p | y; ψ) = c{β_{pp}(λ̄, ψ̄)}^{1/2} exp{l̄(ψ; x^p | y) − l̄(ψ̄; x^p | y)}{1 + O(n^{-3/2})},  (28)

where c is a normalizing constant such that the approximation on the right-hand side of (28) integrates to 1 + O(n^{-3/2}).

As for (24), approximations to conditional tail probabilities for X^p can be obtained by applying (5) to (28). In this case, the choice

  l(x^p) = −l̄(ψ̄; x^p | y) − n x^p ψ,  b(x^p) = {β_{pp}(λ̄, ψ̄)}^{1/2}

is convenient. Since l(x^p) = −l̄(ψ̄; x^p | y) − n x^p ψ, we have l^{(1)}(x^p) = n(ψ̄ − ψ). Hence x̃^p is the value of x^p whose corresponding ψ̄ equals ψ, and moreover −l^{(2)}(x^p) = nβ_{pp}(λ̄, ψ̄){1 + O(n^{-1})}. Formula (5) yields the approximation

  pr(X^p ≤ x^p | Y = y; ψ) = Φ(r) + φ(r)[1/r + {−l^{(2)}(x̃^p)}^{1/2} b(x^p)/{n(ψ̄ − ψ) b(x̃^p)}] + O(n^{-3/2}),  (29)

where

  r = sgn(ψ − ψ̄)[2{l̄(ψ̄; x^p | y) − l̄(ψ; x^p | y)}]^{1/2}.  (30)

Since the ½ log|B₁(λ, ψ)| term in l̄(ψ; x^p | y) is O(1), it can be shown that

  {−l^{(2)}(x̃^p)}^{1/2} b(x^p)/b(x̃^p) = {nβ_{pp}(λ̄, ψ̄)}^{1/2}{1 + O(n^{-1})}.

Consequently, an alternative approximation to (29) is

  pr(X^p ≤ x^p | Y = y; ψ) = Φ(r) + φ(r)[1/r + {β_{pp}(λ̄, ψ̄)}^{1/2}/{n^{1/2}(ψ̄ − ψ)}] + O(n^{-3/2}).  (31)

In an as yet unpublished technical report, D. A. S. Fraser and N. Reid have derived approximation (31) using different techniques.

Note that r defined at (30) is the signed root of the approximate conditional likelihood ratio statistic for ψ having observed X = x. As in the case of (25), an approximate conditional upper 1 − α confidence limit for ψ can be computed as the value of ψ for which the right-hand side of (29) equals 1 − α. These approximate limits have the same asymptotic properties as those derived from (25), and they improve upon the usual limits derived from the uncorrected standard normal approximation Φ(r).

For more general parametric models, corrections that improve the accuracy of the standard normal approximation to distributions of signed roots of likelihood ratio statistics can be derived from formula (4). Welch & Peers (1963) and Peers (1965) have described how a prior density function for a vector parameter should be chosen so that the posterior quantiles for a component of the vector are approximate confidence limits in the repeated sampling sense, having coverage error of order O(n^{-1}). Using such prior densities, modifications to signed roots of likelihood ratio statistics can be derived by applying formula (4) to the joint posterior density of the vector parameter. We develop these corrections in an as yet unpublished paper and relate them to modifications proposed by Barndorff-Nielsen (1990a, b, c). Barndorff-Nielsen's modifications arise from integration of his formula for approximating the conditional density of the maximum likelihood estimator given an ancillary statistic.

ACKNOWLEDGEMENT

We are grateful to Luke Tierney for helpful discussions.

REFERENCES

BARNDORFF-NIELSEN, O. E. (1986). Inference on full or partial parameters based on the standardized signed log likelihood ratio. Biometrika 73.
BARNDORFF-NIELSEN, O. E. (1990a). Discussion of paper by D. A. Sprott. Can. J. Statist. 18.
BARNDORFF-NIELSEN, O. E. (1990b). A note on the standardized signed log likelihood ratio. Scand. J. Statist. 17.
BARNDORFF-NIELSEN, O. E. (1990c). Approximate interval probabilities. J. R. Statist. Soc. B 52.
BARNDORFF-NIELSEN, O. E. & COX, D. R. (1979). Edgeworth and saddlepoint approximations with statistical applications (with discussion). J. R. Statist. Soc. B 41.
DANIELS, H. E. (1958). Discussion of paper by D. R. Cox. J. R. Statist. Soc. B 20.
DAVISON, A. C. (1988). Approximate conditional inference in generalized linear models. J. R. Statist. Soc. B 50.
DICICCIO, T. J., FIELD, C. A. & FRASER, D. A. S. (1990). Approximations of marginal tail probabilities and inference for scalar parameters. Biometrika 77.
DURBIN, J. (1980). Approximations for densities of sufficient estimators. Biometrika 67.
FEIGL, P. & ZELEN, M. (1965). Estimation of exponential survival probabilities with concomitant information. Biometrics 21.
LAWLESS, J. F. (1982). Statistical Models and Methods for Lifetime Data. New York: Wiley.
LEONARD, T., HSU, J. S. J. & TSUI, K.-W. (1989). Bayesian marginal inference. J. Am. Statist. Assoc. 84.
PEERS, H. W. (1965). On confidence points and Bayesian probability points in the case of several parameters. J. R. Statist. Soc. B 27.
REID, N. (1988). Saddlepoint methods and statistical inference (with discussion). Statist. Sci. 3.
SKOVGAARD, I. M. (1987). Saddlepoint expansions for conditional distributions. J. Appl. Prob. 24.
TIERNEY, L., KASS, R. E. & KADANE, J. B. (1989). Approximate marginal densities of nonlinear functions. Biometrika 76. Amendment (1991), 78.
WELCH, B. L. & PEERS, H. W. (1963). On formulae for confidence points based on intervals of weighted likelihoods. J. R. Statist. Soc. B 25.

[Received April. Revised March 1991]


Lecture 4: Numerical solution of ordinary differential equations Lecture 4: Numerical solution of ordinary differential equations Department of Mathematics, ETH Zürich General explicit one-step method: Consistency; Stability; Convergence. High-order methods: Taylor

More information

ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS

ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS ESTIMATING THE MEAN LEVEL OF FINE PARTICULATE MATTER: AN APPLICATION OF SPATIAL STATISTICS Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, N.C.,

More information

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be

Quantile methods. Class Notes Manuel Arellano December 1, Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Quantile methods Class Notes Manuel Arellano December 1, 2009 1 Unconditional quantiles Let F (r) =Pr(Y r). Forτ (0, 1), theτth population quantile of Y is defined to be Q τ (Y ) q τ F 1 (τ) =inf{r : F

More information

Theory and Methods of Statistical Inference. PART I Frequentist theory and methods

Theory and Methods of Statistical Inference. PART I Frequentist theory and methods PhD School in Statistics cycle XXVI, 2011 Theory and Methods of Statistical Inference PART I Frequentist theory and methods (A. Salvan, N. Sartori, L. Pace) Syllabus Some prerequisites: Empirical distribution

More information

Lecture 4. f X T, (x t, ) = f X,T (x, t ) f T (t )

Lecture 4. f X T, (x t, ) = f X,T (x, t ) f T (t ) LECURE NOES 21 Lecture 4 7. Sufficient statistics Consider the usual statistical setup: the data is X and the paramter is. o gain information about the parameter we study various functions of the data

More information

Statistical Models. ref: chapter 1 of Bates, D and D. Watts (1988) Nonlinear Regression Analysis and its Applications, Wiley. Dave Campbell 2009

Statistical Models. ref: chapter 1 of Bates, D and D. Watts (1988) Nonlinear Regression Analysis and its Applications, Wiley. Dave Campbell 2009 Statistical Models ref: chapter 1 of Bates, D and D. Watts (1988) Nonlinear Regression Analysis and its Applications, Wiley Dave Campbell 2009 Today linear regression in terms of the response surface geometry

More information

Introduction to Maximum Likelihood Estimation

Introduction to Maximum Likelihood Estimation Introduction to Maximum Likelihood Estimation Eric Zivot July 26, 2012 The Likelihood Function Let 1 be an iid sample with pdf ( ; ) where is a ( 1) vector of parameters that characterize ( ; ) Example:

More information

BFF Four: Are we Converging?

BFF Four: Are we Converging? BFF Four: Are we Converging? Nancy Reid May 2, 2017 Classical Approaches: A Look Way Back Nature of Probability BFF one to three: a look back Comparisons Are we getting there? BFF Four Harvard, May 2017

More information

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky

A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky A COMPARISON OF POISSON AND BINOMIAL EMPIRICAL LIKELIHOOD Mai Zhou and Hui Fang University of Kentucky Empirical likelihood with right censored data were studied by Thomas and Grunkmier (1975), Li (1995),

More information

Multivariate Analysis and Likelihood Inference

Multivariate Analysis and Likelihood Inference Multivariate Analysis and Likelihood Inference Outline 1 Joint Distribution of Random Variables 2 Principal Component Analysis (PCA) 3 Multivariate Normal Distribution 4 Likelihood Inference Joint density

More information

Massachusetts Institute of Technology Department of Economics Time Series Lecture 6: Additional Results for VAR s

Massachusetts Institute of Technology Department of Economics Time Series Lecture 6: Additional Results for VAR s Massachusetts Institute of Technology Department of Economics Time Series 14.384 Guido Kuersteiner Lecture 6: Additional Results for VAR s 6.1. Confidence Intervals for Impulse Response Functions There

More information

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints

A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Noname manuscript No. (will be inserted by the editor) A Recursive Formula for the Kaplan-Meier Estimator with Mean Constraints Mai Zhou Yifan Yang Received: date / Accepted: date Abstract In this note

More information

PROD. TYPE: COM ARTICLE IN PRESS. Computational Statistics & Data Analysis ( )

PROD. TYPE: COM ARTICLE IN PRESS. Computational Statistics & Data Analysis ( ) COMSTA 28 pp: -2 (col.fig.: nil) PROD. TYPE: COM ED: JS PAGN: Usha.N -- SCAN: Bindu Computational Statistics & Data Analysis ( ) www.elsevier.com/locate/csda Transformation approaches for the construction

More information

On prediction and density estimation Peter McCullagh University of Chicago December 2004

On prediction and density estimation Peter McCullagh University of Chicago December 2004 On prediction and density estimation Peter McCullagh University of Chicago December 2004 Summary Having observed the initial segment of a random sequence, subsequent values may be predicted by calculating

More information

Stat 5101 Notes: Algorithms

Stat 5101 Notes: Algorithms Stat 5101 Notes: Algorithms Charles J. Geyer January 22, 2016 Contents 1 Calculating an Expectation or a Probability 3 1.1 From a PMF........................... 3 1.2 From a PDF...........................

More information