Nonparametric Tests for Multi-parameter M-estimators

John Robinson
School of Mathematics and Statistics, University of Sydney

The talk is based on joint work with John Kolassa. It follows from work over a number of years with Chris Field, Elvezio Ronchetti and Alastair Young, and generalizes Robinson, Ronchetti and Young (2003) and Field, Robinson and Ronchetti (2008).

Typeset by FoilTEX
Abstract

Tests of hypotheses concerning subsets of multivariate means or coefficients in linear or generalized linear models depend on parametric assumptions which may not hold. One nonparametric approach to these problems uses the standard nonparametric bootstrap, taking test statistics derived from some parametric model but basing inferences on bootstrap approximations. We derive different test statistics based on empirical exponential families and use a tilted bootstrap to give inferences. The bootstrap approximations can be approximated to relative second order accuracy by a saddlepoint approximation. This generalises earlier work in two ways. First, we generalise from bootstraps based on resampling vectors of both response and explanatory variables to include bootstrapping residuals for fixed explanatory variables, and second, we obtain a theorem for tail probabilities under weak conditions justifying approximation to bootstrap results for both cases.
A likelihood based test

Let $Y_1(\theta), \dots, Y_n(\theta)$ be iid random vectors, with $Y_j(\theta)$ from a distribution $F$ on the sample space $\mathcal{Y}$. Suppose that $\theta$ satisfies
$$E\,\psi_j(Y_j(\theta), \theta) = 0. \qquad (1)$$
Let $\theta = (\theta_1, \theta_2)$, where $\theta_1 \in R^{p_1}$ and $\theta_2 \in R^{p_2}$, $p_1 + p_2 = p$, and consider a test of the null hypothesis $H_0: \theta_2 = \theta_{20}$. If the common distribution of the $Y_j(\theta)$ belongs to some parametric model, then $F$ belongs to a class of distributions such that (1) holds with $\theta_2 = \theta_{20}$, and likelihood ratio tests may be used.
Consider test statistics based on $T = (T_1, T_2)$, the M-estimate of $\theta$, the solution of
$$\sum_j \psi_j(Y_j(\theta), T) = 0, \qquad (2)$$
where the $\psi_j$ are assumed to be smooth functions from $\mathcal{Y} \times R^p$ to $R^p$. The functions $\psi_j$ are often derived from some parametric model, sometimes adjusted to make estimates robust. For the present we ignore robustness. We will propose a test statistic $h(T_2)$.
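As a hypothetical illustration of an M-estimate of type (2), not from the talk: a univariate location estimate with a Huber-type score, solved by root bracketing. The tuning constant c = 1.345 and the simulated data are assumptions of this sketch.

```python
import numpy as np
from scipy.optimize import brentq

def psi_huber(r, c=1.345):
    """Huber-type score: linear near zero, clipped at +/- c."""
    return np.clip(r, -c, c)

def m_estimate_location(y, c=1.345):
    """Solve sum_j psi(y_j - t) = 0 for t; the score is decreasing in t,
    so the root can be bracketed by the sample range."""
    score = lambda t: np.sum(psi_huber(y - t, c))
    return brentq(score, y.min() - 1.0, y.max() + 1.0)

rng = np.random.default_rng(0)
y = rng.standard_normal(200) + 3.0   # illustrative data centred at 3
t_hat = m_estimate_location(y)
```

With the clipping removed the same root is the sample mean; the clipped score is the kind of robustness adjustment mentioned above.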
Three examples

The examples arise in linear models and generalized linear models with response variables $Z_j$ and explanatory variables $X_j$.

Correlation model: The $Y_j(\theta) = (Z_j, X_j)$ are iid random vectors, $\psi_j(Y_j(\theta), t) = X_j(Z_j - t^T X_j)$, and the estimating equations are
$$\sum_j X_j (Z_j - t^T X_j) = 0.$$
Regression model: The $X_j = x_j$ are fixed, as in the case of a designed experiment. Here the residuals $Y_j(\theta) = Z_j - \theta^T x_j$ are iid random variables and the estimating equations are
$$\sum_j \psi_j(Y_j(\theta), t) = \sum_j x_j \big( Y_j(\theta) + (\theta - t)^T x_j \big) = \sum_j x_j (Z_j - t^T x_j) = 0.$$
Generalized Linear Models, Poisson case: The response variables $Z_j$, for $j = 1, \dots, n$, are such that $E Z_j = \mu_j$, $\mathrm{var}(Z_j) = V(\mu_j)$ and $\eta_j = \log(\mu_j) = X_j^T \theta$, $j = 1, \dots, n$. Consider only the correlation case, with $Y_j(\theta) = (Z_j, X_j)$ iid random vectors. The estimating equations are
$$\sum_j \psi_j(Y_j(\theta), t) = \sum_j X_j (Z_j - e^{X_j^T t}) = 0.$$
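A minimal sketch, with illustrative simulated data, of solving the Poisson estimating equations numerically; the solution is the usual Poisson maximum likelihood estimate under the log link.

```python
import numpy as np
from scipy.optimize import root

rng = np.random.default_rng(1)
n = 300
X = np.column_stack([np.ones(n), rng.uniform(-1.0, 1.0, n)])
theta_true = np.array([0.5, 1.0])            # illustrative coefficients
Z = rng.poisson(np.exp(X @ theta_true))      # Poisson responses, log link

def score(t):
    """Sum of Poisson scores: sum_j X_j (Z_j - exp(X_j^T t))."""
    return X.T @ (Z - np.exp(X @ t))

sol = root(score, x0=np.zeros(2))
t_hat = sol.x   # the M-estimate T of (2)
```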
Test statistic for the parametric model: $Y_j(\theta)$ has distribution $F$. Put
$$nK(\tau, t) = \sum_j K_j(\tau, t) = \sum_j \log E[\exp(\tau^T \psi_j(Y_j(\theta), t))]. \qquad (3)$$
Consider a test statistic based on the function $h(\cdot)$ defined by
$$h(t_2) = \inf_{t_1} \sup_\tau \{-K(\tau, t)\} = -K(\tau(t(t_2)), t(t_2)), \qquad (4)$$
where $t(t_2) = (t_1(t_2), t_2)$, for $\tau(t) = \arg\sup_\tau [-K(\tau, t)]$ and $t_1(t_2) = \arg\inf_{t_1} [-K(\tau(t), t)]$.

Note $h(\theta_{20}) = 0$, $h'(\theta_{20}) = 0$ and $h''(\theta_{20}) > 0$, so $h(t_2) \ge 0$.
Tilted bootstrap: $F$, and so $K$, is unknown. Consider weighted empirical distributions
$$\hat F(x) = \sum_k w_k I(Y_k(\theta) \le x),$$
where $\theta = (\theta_1, \theta_{20})$ and the $w_k$ minimize
$$\sum_k w_k \log(n w_k),$$
subject to
$$\sum_k w_k \frac{1}{n} \sum_j \psi_j(Y_k(\theta), \theta) = 0 \quad \text{and} \quad \sum_k w_k = 1. \qquad (5)$$
So we find stationary values of
$$\sum_k w_k \log(n w_k) - \beta^T \sum_k w_k \frac{1}{n} \sum_j \psi_j(Y_k(\theta), \theta) + \gamma \Big( \sum_k w_k - 1 \Big) \qquad (6)$$
with respect to $w_k$, $\beta$, $\gamma$ and $\theta_1$, with $\theta_2 = \theta_{20}$. Equating derivatives with respect to $w_k$ to 0 gives
$$w_k = \frac{1}{n} \exp\Big( \beta^T \sum_j \psi_j(Y_k(\theta), \theta)/n - \kappa(\beta, \theta) \Big), \qquad (7)$$
where
$$\kappa(\beta, \theta) = \log \frac{1}{n} \sum_k \exp\Big( \beta^T \sum_j \psi_j(Y_k(\theta), \theta)/n \Big). \qquad (8)$$
Then (6) reduces to
$$\sum_k w_k \log(n w_k) = -\kappa(\beta, \theta).$$
So the minimum of (6) under the constraints is $-\kappa(\beta(\theta_0), \theta_0)$, where $\theta_0 = (\theta_1(\theta_{20}), \theta_{20})$,
$$\beta(\theta) = \arg\sup_\beta \{-\kappa(\beta, \theta)\} \quad \text{and} \quad \theta_1(\theta_{20}) = \arg\inf_{\theta_1} \{-\kappa(\beta(\theta), \theta)\},$$
and the minimizing weights are
$$\hat w_k = \frac{1}{n} \exp\Big( \beta(\theta_0)^T \sum_j \psi_j(Y_k(\theta_0), \theta_0)/n - \kappa(\beta(\theta_0), \theta_0) \Big).$$
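The weights (7)–(8) are cheap to compute because $\kappa(\beta, \theta)$ is convex in $\beta$. A sketch for a one-dimensional averaged score; the data standing in for $(1/n)\sum_j \psi_j(Y_k(\theta), \theta)$ are purely illustrative.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
# s_k stands in for the averaged score (1/n) sum_j psi_j(Y_k(theta), theta),
# taken one-dimensional here; the data are purely illustrative
s = rng.standard_normal(100) + 0.2

def kappa(beta):
    """kappa(beta) = log (1/n) sum_k exp(beta * s_k), as in (8)."""
    return np.log(np.mean(np.exp(beta * s)))

# beta = arg sup { -kappa }, i.e. the minimizer of the convex kappa
beta_hat = minimize_scalar(kappa).x
w = np.exp(beta_hat * s - kappa(beta_hat)) / len(s)   # the weights (7)
```

At the minimizer the gradient of $\kappa$ vanishes, which is exactly the tilting constraint $\sum_k w_k s_k = 0$ of (5).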
Test statistic for the tilted bootstrap. To test $H_0: \theta_2 = \theta_{20}$, take the empirical exponential family
$$\hat F(y) = \sum_k \hat w_k I\{Y_k(\theta_0) \le y\}. \qquad (9)$$
Draw $Y_1^*, \dots, Y_n^*$ from $\hat F$; then $\sum_j \psi_j(Y_j^*, t)$ has cumulant generating function $n\hat K(\tau; t)$, where
$$n\hat K(\tau; t) = \sum_j \log \sum_{k=1}^n \hat w_k \exp\big( \tau^T \psi_j(Y_k(\theta_0); t) \big). \qquad (10)$$
Then the test statistic is based on the function $\hat h(\cdot)$ defined by
$$\hat h(t_2) = \inf_{t_1} \sup_\tau [-\hat K(\tau; t)] = -\hat K(\tau(t(t_2)); t(t_2)), \qquad (11)$$
where $t(t_2) = (t_1(t_2), t_2)$,
$$\tau(t) = \arg\sup_\tau [-\hat K(\tau, t)] \quad \text{and} \quad t_1(t_2) = \arg\inf_{t_1} [-\hat K(\tau(t), t)].$$
Again $\hat h(\theta_{20}) = 0$, $\hat h'(\theta_{20}) = 0$ and $\hat h''(\theta_{20}) > 0$, so $\hat h(t_2) \ge 0$.

Obtain $B$ tilted bootstrap samples and for each obtain $T^* = (T_1^*, T_2^*)$, the solution of $\sum_j \psi_j(Y_j^*, t) = 0$. Calculate $\hat h(T_2^*)$; the p-value for the test is the proportion of the $\hat h(T_2^*)$ greater than $\hat h(T_2)$. Note that $\hat h(T_2^*)$ needs to be calculated from (11) for each $T_2^*$.
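To fix ideas, a sketch of the whole procedure in the simplest case $p = p_2 = 1$ (testing a mean, so there is no $t_1$ to profile out and the optimizations in (11) are one-dimensional). The data, null value and number of bootstrap samples B are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(3)
y = rng.exponential(1.0, 80)   # illustrative data; true mean is 1
theta_20 = 1.0                 # null value of the mean
B = 199                        # number of tilted bootstrap samples

def tilt_weights(y, theta):
    """Weights (7) for the one-dimensional score psi(y, t) = y - t."""
    s = y - theta
    kappa = lambda b: np.log(np.mean(np.exp(b * s)))
    b_hat = minimize_scalar(kappa).x          # kappa is convex in beta
    return np.exp(b_hat * s - kappa(b_hat)) / len(y)

def h_hat(y, w, t):
    """h_hat(t) = sup_tau { -K_hat(tau; t) }, with K_hat as in (10), p = 1."""
    K = lambda tau: np.log(np.sum(w * np.exp(tau * (y - t))))
    return -minimize_scalar(K).fun            # K is convex in tau

w = tilt_weights(y, theta_20)
t_obs = y.mean()                   # M-estimate: solves sum_j (y_j - t) = 0
h_obs = h_hat(y, w, t_obs)

exceed = 0
for _ in range(B):
    y_star = rng.choice(y, size=len(y), replace=True, p=w)  # draw from F_hat
    exceed += h_hat(y, w, y_star.mean()) >= h_obs
p_value = (exceed + 1) / (B + 1)
```

Each replicate needs only the resample mean and one one-dimensional optimization, mirroring the remark that $\hat h(T_2^*)$ must be recomputed from (11) for each $T_2^*$.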
A general saddlepoint approximation for multivariate M-estimates

Under mild assumptions on the existence of $K(\tau, t)$ and smoothness of the scores, we have two approximations: the generalised Lugannani-Rice form
$$P(2h(T_2) \ge u^2) = \bar Q_{p_2}(n u^2)[1 + O(n^{-1})] + n^{-1} c_n u^{p_2} e^{-n u^2/2}\, \frac{G(u) - 1}{u^2},$$
and the generalised Barndorff-Nielsen form
$$P(2h(T_2) \ge u^2) = \bar Q_{p_2}(n \hat u^2)[1 + O(n^{-1})],$$
where $\hat u = u - \log G(u)/nu$.
Here $\bar Q_{p_2} = 1 - Q_{p_2}$, where $Q_{p_2}$ is the distribution function of a chi-squared variate with $p_2$ degrees of freedom, $c_n$ is a known constant, $\hat u = u - \log G(u)/nu$, and
$$G(u) = \int_{S_{p_2}} \delta(u, s)\, ds = 1 + u^2 k(u), \qquad (12)$$
for $\delta(u, s)$ a somewhat complicated function involving derivatives of the cumulant generating function, where $(r, s)$ are the polar coordinates of $\hat h''(\theta_{20})^{1/2}(t_2 - \theta_{20})$. These results may also be shown to hold for the bootstrap, where the direct Monte Carlo approximation requires heavy calculation at each step.
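The leading term is an ordinary chi-squared tail probability, so the Barndorff-Nielsen form is cheap to evaluate once $G(u)$ is available. A sketch with a hypothetical constant $k(u) = 0.5$ standing in for the $\delta(u, s)$ integral; all numerical values are illustrative.

```python
import numpy as np
from scipy.stats import chi2

def bn_tail(u, n, p2, G):
    """Barndorff-Nielsen form: Q_bar_{p2}(n u_hat^2), u_hat = u - log G(u)/(n u)."""
    u_hat = u - np.log(G(u)) / (n * u)
    return chi2.sf(n * u_hat**2, df=p2)      # chi2.sf is the tail Q_bar

# hypothetical correction factor G(u) = 1 + u^2 k(u) with constant k = 0.5
G = lambda u: 1.0 + 0.5 * u**2

n, p2, u = 50, 2, 0.3
first_order = chi2.sf(n * u**2, df=p2)       # plain tail Q_bar_{p2}(n u^2)
adjusted = bn_tail(u, n, p2, G)
```

Since $G(u) > 1$ here, $\hat u < u$ and the adjusted tail probability exceeds the first-order chi-squared tail, the direction of the corrections seen in the tables below.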
Application to Special Cases

For the correlation model, for both linear models and generalized linear models, the values $Y_j(\theta)$ are $(Z_j, X_j)$ and the score functions $\psi_j$ do not depend on $j$. So, replacing the general score function $\psi_j(Y_j(\theta), t)$ by $X_j(Z_j - t^T X_j)$ and $X_j(Z_j - e^{t^T X_j})$, respectively, we can proceed as in the general case.
Regression model: For the regression model we will, without loss of generality, consider the case where the first element of each $x_k$ is 1 and $\bar x = (1, 0, \dots, 0)^T$. The first constraint of (5) is then
$$\sum_k w_k (1, 0, \dots, 0)^T (Z_k - \theta^T x_k) = 0,$$
leading, from (7) and (8), to
$$w_k = \frac{1}{n} e^{\beta_1 (Z_k - \theta^T x_k) - \kappa(\beta_1, \theta)},$$
where
$$\kappa(\beta_1, \theta) = \log \frac{1}{n} \sum_k e^{\beta_1 (Z_k - \theta^T x_k)}.$$
If $\beta = (\beta_1, \beta_2^T)^T$, the $p - 1$ vector $\beta_2$ is not estimable.
Now we may take $\beta(\theta) = (\beta_1(\theta), 0)^T$, with $\beta_1(\theta)$ the solution to
$$\frac{\partial \kappa(\beta_1, \theta)}{\partial \beta_1} \propto \sum_k (Z_k - \theta^T x_k) e^{\beta_1 (Z_k - \theta^T x_k)} = 0. \qquad (13)$$
Also, $\theta_0$ is the solution to $d\kappa(\beta_1(\theta), \theta)/d\theta_1 = 0$, so
$$\frac{d\beta_1(\theta)}{d\theta_1} \frac{1}{n} \sum_k (Z_k - \theta^T x_k) e^{\beta_1(\theta)(Z_k - \theta^T x_k)} - \beta_1(\theta) \frac{1}{n} \sum_k x_k^{(1)} I(|Z_k - \theta^T x_k| < b) e^{\beta_1(\theta)(Z_k - \theta^T x_k)} = 0,$$
where $x_k^{(1)}$ contains the first $p_1$ elements of $x_k$.
The first term here is zero from (13), so $\beta_1(\theta_0) = 0$ and it follows that $w_k = 1/n$. We require that $\theta_0$ satisfies $\sum_{k=1}^n (Z_k - \theta_0^T x_k) = 0$. This does not uniquely define $\theta_0$, but we propose using the solution to
$$\sum_k x_k^{(1)} (Z_k - \theta_0^T x_k) = 0. \qquad (14)$$
Bootstrap replicates $R_1^*, \dots, R_n^*$ are obtained by sampling from the empirical distribution
$$\hat F_{\theta_{20}}(x) = \frac{1}{n} \sum_k I(Z_k - \theta_0^T x_k \le x). \qquad (15)$$
Then $\psi_j(R_j^*, t) = x_j \big( R_j^* + (\theta_0 - t)^T x_j \big)$. A Monte Carlo approximation to the tail probability of the test statistic can be obtained exactly as for the correlation case. The cumulant generating function of $\sum_j \psi_j(R_j^*, t)$ is given by
$$n\hat K(\tau, t) = \sum_j \log \frac{1}{n} \sum_k e^{\tau^T x_j (Z_k - \theta_0^T x_k + (\theta_0 - t)^T x_j)},$$
and we can calculate the statistic $\hat h(T_2^*)$ as in (11), where $T^* = (T_1^{*T}, T_2^{*T})^T$ is the solution of $\sum_j \psi_j(R_j^*, t) = 0$.
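A sketch of the residual-resampling step (14)–(15) for a linear model. The simulated data are illustrative, and the null fixes the last coefficient at 0 for simplicity, so the constrained fit (14) reduces to least squares on the first $p_1$ columns.

```python
import numpy as np

rng = np.random.default_rng(4)
n, p1 = 60, 2
x = np.column_stack([np.ones(n), rng.uniform(-1, 1, n), rng.uniform(-1, 1, n)])
z = x @ np.array([1.0, 2.0, 0.0]) + rng.standard_normal(n)

# Constrained fit: with theta_20 = 0, (14) is least squares on the first p1 columns
t1_null, *_ = np.linalg.lstsq(x[:, :p1], z, rcond=None)
theta_0 = np.concatenate([t1_null, [0.0]])
resid = z - x @ theta_0            # draws from F_hat_{theta_20} in (15)

# One bootstrap replicate: resample residuals, rebuild responses, refit fully
r_star = rng.choice(resid, size=n, replace=True)
z_star = x @ theta_0 + r_star
t_star, *_ = np.linalg.lstsq(x, z_star, rcond=None)
```

The least-squares normal equations guarantee that the residuals satisfy (14) exactly, which is why the tilted weights collapse to $w_k = 1/n$ in this case.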
http://www.maths.usyd.edu.au/u/johnr/spbs

Accuracy of saddlepoint approximation

Table 1: Comparison of bootstrap Monte Carlo results with saddlepoint approximations for an overdispersed negative binomial generated model, analysed as a Poisson model.

û        0.2     0.3     0.4     0.5     0.6     0.7
BSMC     0.5150  0.2099  0.0567  0.0110  0.0024  0.0003
SPLR     0.5065  0.2121  0.0603  0.0119  0.0013  0.0001
SPBN     0.5043  0.2103  0.0596  0.0117  0.0013  0.0001
χ²₂      0.4493  0.1653  0.0408  0.0067  0.0007  0.0001
Table 2: Comparison of bootstrap Monte Carlo results with saddlepoint approximation for linear regression under the correlation model.

u        0.1     0.2     0.3     0.4      0.5     0.6
BSMC     0.9448  0.6937  0.3587  0.11267  0.0306  0.0050
SPLR     0.9490  0.7004  0.3643  0.1307   0.0323  0.0055
SPBN     0.9486  0.6982  0.3610  0.1284   0.0313  0.0052
χ²₃      0.8607  0.5488  0.2592  0.0907   0.0235  0.0045
Table 3: Comparison of bootstrap Monte Carlo results with saddlepoint approximation for linear regression under the regression model.

u        0.1     0.2     0.3     0.4     0.5     0.6     0.7
BSMC     0.8386  0.5216  0.2365  0.0806  0.0215  0.0031  0.0006
SPLR     0.8473  0.5200  0.2352  0.0793  0.0200  0.0038  0.0005
SPBN     0.8469  0.5194  0.2348  0.0791  0.0200  0.0038  0.0005
χ²₃      0.8607  0.5488  0.2592  0.0907  0.0235  0.0045  0.0006
Figure 1: Accuracy of the bootstrap approximation. Ordered p-values for the bootstrap test (solid) and the parametric test (dashed), in three panels: Poisson model, negative binomial model and regression model; both axes run from 0 to 1. [Plots not reproduced.]
Power comparisons

Table 4: Power of bootstrap and standard tests for simulations under Poisson and negative binomial models.

(θ₃, θ₄)           (0,0)  (.1,.1)  (.2,.2)  (.3,.3)  (.4,.4)  (.5,.5)
Poisson model
  Bootstrap         0.04   0.09     0.32     0.70     0.96     0.99
  Likelihood ratio  0.05   0.09     0.33     0.77     0.98     0.99
Negative binomial model
  Bootstrap         0.04   0.06     0.04     0.22     0.26     0.54
  GLM               0.44   0.50     0.67     0.87     0.85     0.95
Table 5: Power of bootstrap and standard tests for simulations under the regression model, with errors distributed as exponential random variables raised to the power 1.5, for 5 replicates of a 3 × 2 design with null hypothesis of no interaction.

(θ₃, θ₄)        (0,0)   (.2,.2)  (.4,.4)  (.6,.6)  (.8,.8)  (1,1)
Bootstrap test  0.052   0.176    0.464    0.734    0.874    0.942
F test          0.040   0.152    0.428    0.694    0.846    0.932
A toy parametric example. Let $Y_{1j}, Y_{2j}$ be independent exponentials with means $\theta_1 + \theta_2$ and $\theta_2$, for $j = 1, \dots, n$, and consider a test of $H_0: \theta_2 = \theta_{20}$. Here
$$F = \big(1 - \exp(-y_1/(\theta_1 + \theta_{20}))\big)\big(1 - \exp(-y_2/\theta_{20})\big).$$
The estimating equations are
$$\sum_j (Y_{1j} - t_1 - t_2) = 0 \quad \text{and} \quad \sum_j (Y_{2j} - t_2) = 0.$$
A simple integration gives
$$K(\tau, t) = -\tau_1 (t_1 + t_2) - \tau_2 t_2 - \log(1 - \tau_1(\theta_1 + \theta_{20})) - \log(1 - \tau_2 \theta_{20}).$$
Equating the derivative of this with respect to $\tau$ to 0 gives
$$\tau_1(t) = (\theta_1 + \theta_{20})^{-1} - (t_1 + t_2)^{-1} \quad \text{and} \quad \tau_2(t) = \theta_{20}^{-1} - t_2^{-1}.$$
So
$$K(\tau(t), t) = 2 - \frac{t_1 + t_2}{\theta_1 + \theta_{20}} - \frac{t_2}{\theta_{20}} + \log\frac{t_1 + t_2}{\theta_1 + \theta_{20}} + \log\frac{t_2}{\theta_{20}}.$$
Now equating the derivative of this with respect to $t_1$ to 0 gives $t_1(t_2) + t_2 = \theta_1 + \theta_{20}$, and so
$$h(t_2) = -K(\tau(t(t_2)), t(t_2)) = \frac{t_2}{\theta_{20}} - 1 - \log\frac{t_2}{\theta_{20}}.$$
Note that $h(\theta_{20}) = 0$, $h'(\theta_{20}) = 0$ and $h''(\theta_{20}) = 1/\theta_{20}^2$.
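The closed form can be checked numerically by plugging the saddlepoint values $\tau(t)$ and $t_1(t_2)$ back into $K$; a sketch with illustrative parameter values $\theta_1 = 2$, $\theta_{20} = 1.5$.

```python
import numpy as np

theta1, theta20 = 2.0, 1.5   # illustrative parameter values

def K(tau1, tau2, t1, t2):
    """K(tau, t) for the toy exponential model."""
    return (-tau1 * (t1 + t2) - tau2 * t2
            - np.log(1 - tau1 * (theta1 + theta20))
            - np.log(1 - tau2 * theta20))

def h(t2):
    """-K at the saddlepoint: tau from the tau-derivative conditions,
    and t1 chosen so that t1 + t2 = theta1 + theta20."""
    t1 = theta1 + theta20 - t2
    tau1 = 1 / (theta1 + theta20) - 1 / (t1 + t2)   # = 0 at this t1
    tau2 = 1 / theta20 - 1 / t2
    return -K(tau1, tau2, t1, t2)

h_closed = lambda t2: t2 / theta20 - 1 - np.log(t2 / theta20)
```

The two functions agree to machine precision, vanish at $t_2 = \theta_{20}$ and are positive elsewhere.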