R package sae: Methodology

Size: px

Start display at page:

Download "R package sae: Methodology"

Elfreda Harmon
5 years ago
Views:

1 R package sae: Methoology Isabel Molina *, Yolana Marhuena March, 2015 Contents 1 Introuction 3 2 Function irect Sampling without replacement Sampling with replacement Function pssynt 5 4 Function ss 6 5 Function eblupfh FH fitting metho ML fitting metho REML fitting metho Function msefh FH fitting metho ML fitting metho REML fitting metho Function eblupsfh ML fitting metho REML fitting metho Function msesfh REML fitting metho ML fitting metho * Department of Statistics, Universia Carlos III e Mari. Aress: C/Mari 126, Getafe (Mari), Spain, Tf: , Fax: , isabel.molina@uc3m.es 1

2 9 Function pbmsesfh Function npbmsesfh Function eblupstfh REML fitting metho Function pbmsestfh Function eblupbhf Function pbmsebhf Function ebbhf Function pbmseebbhf 32 References 33 2

3 1 Introuction The R package sae estimates characteristics of the omains of a population, when a sample is available from the whole population, but not all the target omains have a sufficiently large sample size. This ocument escribes the methoology behin the functions of the package, giving the exact formulas an proceures implemente in each function. The following notation will be use through all this ocument. U enotes the target population, assume to be partitione into D subsets U 1,..., U D calle inistinctly omains or areas, of respective sizes N 1,..., N D. The measurement of the variable of interest for j-th iniviual within -th area is enote Y j an y = (Y 1,..., Y N ) is the vector with the measurements for -th area. The target omain parameters have the form δ = h (y ), = 1,..., D, for known real measurable functions h (). Particular cases of δ are the omain means δ = Ȳ = N 1 N j=1 Y j, = 1,..., D. These parameters are estimate using the information coming from a ranom sample s of size n rawn from U. Here, s is the subsample of size n from omain U, = 1,..., D, where n = D =1 n, an r = U s is the sample complement from omain, of size N n, = 1,..., D. 2 Function irect Function irect estimates the area means Ȳ, = 1,..., D, where, for each area, the estimator of Ȳ is calculate using only the sample ata s from that same omain. The obtaine estimators are unbiase with respect to the sampling esign (esign-unbiase). The call to the function is irect(y, om, sweight, omsize, ata, replace = FALSE) The particular estimator calculate by the function epens on the specifie arguments replace an sweight, relate to the sampling esign. If replace is TRUE, the sampling esign is assume to be with replacement an otherwise without replacement. The sampling weights shoul be introuce through the argument sweight, but when this argument is roppe, the function assumes simple ranom sampling (SRS). Uner SRS, the sizes of the omains N, = 1,..., D (omsize) are unnecessary. 2.1 Sampling without replacement Consier that the sample s is rawn without replacement within omain U, = 1,..., D. Let π j be the inclusion probability of j-th unit from -th omain in the corresponing omain sample s an let w j = π 1 j be the corresponing sampling 3

4 weight. The unbiase estimator of Ȳ is the Horvitz-Thompson estimator, given by ˆȲ DIR = N 1 Y j = N 1 w j Y j. π j s j j s Now let π,jk be the inclusion probability of the pair of units j an k from -th omain in the sample s. The sampling variance is given by DIR V π ( ˆȲ ) = 1 N Y 2 N N j Y j Y k (1 π N 2 j ) + 2 (π,jk π j π k ) π j π j π k. j=1 If π,jk > 0, (j, k), an unbiase estimator of this variance is given by DIR ˆV π ( ˆȲ ) = 1 j (1 π N 2 πj 2 j ) + 2 Y j Y k (π,jk π j π k ) π j s j π k π,jk. (1) j s Y 2 This estimator requires the secon orer inclusion probabilities, but many times these are not available. Then, it is common to fin an approximation of (1). A simple approximation is obtaine by consiering π,jk π j π k, which hols exactly in the case of Poisson sampling. This approximation makes the secon sum in (1) equal to zero an leas to the estimator DIR ˆV π ( ˆȲ ) = 1 N 2 j=1 k=1 k>j k s k>j 1 π j π 2 j s j Writing the approximate unbiase estimator of the variance in terms of the sampling weights w j (sweight), we get the expression provie by function irect when the argument sweight is specifie, DIR ˆV π ( ˆȲ ) = 1 N 2 Y 2 j. j. j s w j (w j 1)Y 2 Uner SRS without replacement, the result of the previous estimator oes not coincie with the usual unbiase estimator. Thus, when the argument sweight is roppe, the function irect assumes SRS without replacement an returns the usual unbiase estimators uner this esign ˆȲ = ȳ = n 1 j s Y j, ˆVπ ( DIR ˆȲ ) = (1 f ) S2, n is the omain quasi- where f = n /N is the omain sampling fraction an S 2 variance S 2 = (n 1) 1 j s (Y j ȳ ) 2. 4

5 2.2 Sampling with replacement Now consier the case in which s is rawn with replacement within omain U, = 1,..., D, with initial selection probabilities P j, j = 1,..., N. In this case, the unbiase estimator of the omain mean is given by ˆȲ DIR = N 1 j s Y j n P j. To obtain this estimator, the argument sweight must contain the vector of weights calculate as w j = (n P j ) 1, j s, = 1,..., D. Using these weights, the unbiase estimator of Ȳ calculate by the function irect with replace=true has exactly the same shape as that one in Section 2.1, i.e. ˆȲ DIR = N 1 j s w j Y j. The sampling variance of this estimator is given by V π ( DIR ˆȲ ) = 1 n N j=1 ( ) 2 Yj N P Ȳ P j. j an using again w j = (n P j ) 1, we obtain the unbiase variance estimator calculate by the function irect, DIR ˆV π ( ˆȲ ) = 1 ( ) 2 Yj DIR ˆȲ = 1 ( f w j Y j ˆȲ ) 2. n N j s P j n j s Uner SRS with replacement, the population sizes N (omsize) are not neee. Thus, when the argument omsize is roppe, the function assumes SRS an calculates the classical unbiase estimators ˆȲ DIR = ȳ = n 1 3 Function pssynt j s Y j, ˆVπ ( DIR ˆȲ ) = n 1 S2. Inirect estimators borrow strength from other omains by making assumptions establishing some homogeneity relationship among omains. The post-stratifie synthetic estimator is a basic inirect estimator. It assumes that ata are istribute into K groups (calle post-strata) that cut across the omains, an such that the within group mean is constant across omains. The groups are assume to have sufficient sample sizes to allow obtaining accurate irect estimates of the group means. These assumptions allow us to estimate a omain mean using a weighte combination of the (reliable) estimates of the group means. The function pssynt calculates post-stratifie synthetic estimates of the omain means Ȳ, = 1,..., D. The call to this function is 5

6 pssynt(y, sweight, ps, omsizebyps, ata) More specifically, the post-stratifie synthetic estimator consiers that there is a grouping variable (ps) which ivies the ata into K post-strata. The population count in the crossing between post-stratum k an omain, N k (omsizebyps), is assume to be known for all k an with N = K k=1 N k. Then, the mean of omain can be calculate as a weighte combination of the means in the crossings of omain with each post-strata Ȳk, k = 1,..., K, as follows Ȳ = 1 N K k=1 N k Ȳ k. (2) Uner the assumption of constant mean across omains within each post-stratum, Ȳ k = Ȳ+k, k = 1,..., K, where Ȳ+k enotes the mean of post-stratum k, an estimator of Ȳ can be obtaine by replacing Ȳ+k = Ȳk in (2) by some irect estimate of Ȳ+k, k = 1,..., K. Thus, we estimate the omain mean Ȳ using all the observations from the poststrata that cut across that omain. To estimate Ȳ+k, we consier the ratio HT estimator, given by ˆȲ +k = Ŷ +k DIR ˆN DIR +k, (3) where Ŷ +k DIR is the irect estimator of the total Y+k DIR in post-stratum k an ˆN +k DIR is the irect estimator of the population count in the same post-stratum, N +k, calculate using the sampling weights w j (sweight) of the units in that post-stratum. Replacing (3) for Ȳk in (2), we obtain the post-stratifie synthetic estimate returne by the function pssynt, ˆȲ SY N = 1 N K N k ˆȲ +k. k=1 4 Function ss The irect estimator of Ȳ is inefficient for a omain with a small sample size. On the other han, the post-stratifie synthetic estimator is biase when the means across omains within a post-stratum are not constant, which is likely to occur in practice. To balance the bias of a synthetic estimator an the instability of a irect estimator, Drew, Singh & Chouhry (1982) propose to take a weighte combination (or composition) of the two, with weight epening on the sample size of the omain. Thus, the function ss estimates the omain means Ȳ, = 1,..., D by a kin of composite estimators calle sample-size epenent estimators. The call to this function is ss(om, sweight, omsize, irect, synthetic, elta = 1, ata) 6

7 As mentione above, the sample-size epenent estimator is obtaine by composition of a irect estimator (irect) an a synthetic estimator (synthetic), both specifie by the user, that is, ˆȲ C = φ ˆȲ DIR SY N + (1 φ ) ˆȲ. The composition weight φ is obtaine as { DIR 1, ˆN δn φ = ; ˆN DIR DIR /(δn ), ˆN < δn, for N known (omsize) an for a given constant δ > 0 (elta). Thus, for a DIR omain with sample size large enough so that the estimate count ˆN is greater than δn, the sample size epenent estimator becomes the irect estimator ˆȲ DIR. Otherwise, it becomes the composition of the irect an the synthetic SY N estimator ˆȲ. The constant δ (elta) controls how much strength to borrow from other omains by attaching more or less weight to the synthetic estimator, with a large value of δ meaning to borrow more strength. 5 Function eblupfh Fay-Herriot (FH) moels were introuce by Fay & Herriot (1979) to obtain small area estimators of meian income in small places in the U.S. These moels are well known in the literature of small area estimation (SAE) an are the basic tool when only aggregate auxiliary ata at the area level are available. The function eblupfh estimates omain characteristics δ = h (y ), = 1,..., D, base on the mentione FH moel, an the call to this function is eblupfh(formula, varir, metho = "REML", MAXITER = 100, PRECISION= , ata) The basic FH moel is efine in two stages. First, since true values δ are DIR not observable, our ata will be the irect estimates ˆδ (left han sie of formula). These estimates have an error an this error might be ifferent for each area because samples sizes in the areas are generally ifferent. Thus, in the first stage, we assume the following moel representing the error of irect estimates, ˆδ DIR in = δ + e, e N(0, ψ ), = 1,..., D, (4) an referre to as sampling moel, where ψ is the sampling variance (varir) DIR of irect estimator ˆδ given δ, assume to be known for all. In a secon stage, true values δ are assume to be linearly relate with a vector of auxiliary variables (right han sie of formula), δ = x β + v, v ii N(0, A), = 1,..., D, (5) 7

8 where v is inepenent of e, = 1,..., D. This is calle the linking moel because it links all the areas through the common moel parameter β. Replacing (5) in (4), we obtain ˆδ DIR = x β + v + e, v ii N(0, A), e in N(0, ψ ), = 1,..., D, or equivalently, ˆδ DIR in N(x β, ψ + A), = 1,..., D. (6) In matrix notation, (6) may be written as y N{Xβ, Σ(A)}, where y = (ˆδ 1 DIR DIR,..., ˆδ D ), X = (x 1,..., x D ) an Σ(A) = iag(a + ψ 1,..., A + ψ D ). The best linear unbiase preictor (BLUP) of δ is given by δ = DIR ˆδ ψ {ˆδDIR x A + ψ β(a) } = {1 B (A)}ˆδ DIR + B (A) x β(a), (7) where B (A) = ψ /(A+ψ ) an β(a) is the maximum likelihoo (ML) estimator of β obtaine from (6) an also the weighte least squares (WLS) estimator of β without normality assumption, given by β(a) = {X Σ 1 (A)X} 1 X Σ 1 (A)y = { D } 1 D (A + ψ ) 1 x x (A + ψ ) 1 x ˆδDIR. (8) =1 =1 Expression (7) shows that δ is obtaine by composition of the irect estimator ˆδ an the regression synthetic estimator x β(a), with more weight DIR attache to the irect estimator when ψ is small relative to the total variance A + ψ, which means that the irect estimator is reliable, an more weight attache to the regression synthetic estimator x β when the irect estimator is not reliable enough an then more strength is require to borrow from the other omains. Since A is unknown, in practice it is replace by a consistent estimator Â. Several estimation methos (metho) for A are consiere incluing a moment estimator calle Fay-Herriot (FH) metho, maximum likelihoo (ML) an restricte (or resiual) ML (REML), see the next subsections. Substituting the obtaine estimator Â for A in the BLUP (7), we get the final empirical BLUP (EBLUP) returne by eblupfh, an given by ˆδ = δ (Â) = {1 B (Â)}ˆδ DIR + B (Â)x ˆβ, (9) where ˆβ = β(â). Function eblupfh elivers, together with the estimate moel coefficients, i.e. the components of ˆβ, their asymptotic stanar errors given by the iagonal 8

9 elements of the Fisher information epening on the specifie estimation metho (metho), the Z statistics obtaine by iviing the estimates by their stanar errors, an the p-values of the significance tests. Since for large D, the three possible estimators satisfy ˆβ N { β, I 1 (β) }, where I(β) is the Fisher information, then the Z statistic for a coefficient β j is Z j = ˆβ j / v jj, j = 1,..., p, where v jj is the estimate asymptotic variance of ˆβ j, given by the j-th element in the iagonal of I 1 ( ˆβ). Finally, for the test p-values are obtaine as H 0 : β j = 0 versus H 1 : β j 0, p-value = 2 P (Z > Z j ), where Z is a stanar normal ranom variable. Three ifferent gooness of fit measures are also elivere by function eblupfh. The first one is the estimate log-likelihoo l(â, ˆβ; y), obtaine by replacing the obtaine estimates Â an ˆβ in (12). The secon criteria is the Akaike Information Criteria (AIC), given in this case by AIC = 2 l(â, ˆβ; y) + 2(p + 1). Finally, the Bayesian Information Criteria (BIC) is obtaine as 5.1 FH fitting metho BIC = 2 l(â, ˆβ; y) + (p + 1) log(d). FH metho gives an estimator of A base on a moments metho originally propose by Fay & Herriot (1979). The FH estimator is given by ÂF H = max(0, A F H ) with A F H obtaine iteratively as the solution of the following nonlinear equation in A, D =1 (A + ψ ) 1 {ˆδ DIR x β(a)} 2 = D p. (10) This equation is solve using an iterative metho such as the Fisher-scoring algorithm. For this, let us efine S F H (A) = D =1 (A + ψ ) 1 {ˆδ DIR x β(a)} 2 D p. 9

10 By a first orer Taylor expansion of S F H (A F H ) aroun the true A, we get 0 = S F H (A F H ) S F H (A) + S F H(A) A (A F H A). (11) The Fisher-scoring algorithm replaces in this equation, the erivative S F H (A)/ A by its expectation E{ S F H (A)/ A}, which in this case is equal to E { S } F H(A) = A Then, solving for A F H in (11), we get A F H = A + D (A + ψ ) 1. =1 [ { E S }] 1 F H(A) S F H(A). A This algorithm is starte taking A = A (0) F H, an then in each iteration it upates the estimate of A with the upating equation A (k+1) F H { = A(k) F H [E + S } ] 1 F H(A) S F H(A (k) F H A ). A=A (k) F H In the function eblupfh, the starting value is set to A (0) F H = meian(ψ ). It stops either when the number of iterations k > MAXITER where MAXITER can be chosen by the user, or when A (k+1) F H A(k) F H A (k) F H Convergence of the iteration is generally rapi. 5.2 ML fitting metho < PRECISION. The moel parameters A an β can be estimate by ML or REML proceures base on the normal likelihoo (6). In fact, uner regularity conitions, the estimators erive from these two methos (an using the Normal likelihoo) remain consistent at orer O p (D 1/2 ) even without the Normality assumption, for etails see Jiang (1996). ML estimators of A an β are obtaine by maximizing the log-likelihoo, given by l(a, β; y) = c 1 2 log Σ(A) 1 2 (y Xβ) Σ 1 (A)(y Xβ), (12) where c enotes a constant. Taking erivative of l with respect to β an equating to zero, we obtain the equation that gives the ML (or WLS) estimator (8). The 10

11 ML equation for A is obtaine taking erivative of l with respect to A an equating to zero, an is given by D =1 (A + ψ ) 2 {ˆδ DIR x β(a)} 2 = D (A + ψ ) 1. (13) =1 Again, Fisher-scoring algorithm is use to solve this equation. efine as S ML (A) = l(a, β; y)/ A an is given by The score is S ML (A) = D =1 (A + ψ ) 2 {ˆδ DIR x β(a)} 2 D (A + ψ ) 1. =1 The Fisher information for A is obtaine by taking expectation of the negative erivative of S ML (A), an is given by I ML (A) = E { S } ML(A) = 1 A 2 Finally, the upating equation for the ML estimator of A is A (k+1) ML D (A + ψ ) 2. (14) =1 { } 1 = A(k) ML + I ML (A (k) ML ) SML (A (k) ML ). Starting value A (0) ML an stopping criterion are the same as in the FH metho escribe above. If A ML is the estimate obtaine in the last iteration of the algorithm, then the final ML estimate returne by eblupfh is ÂML = max(0, A ML ). 5.3 REML fitting metho The REML estimator of A is obtaine by maximizing the so calle restricte likelihoo, which is the joint p..f. of a vector of D p inepenent contrasts F y of the ata y, where F is an D (D p) full column rank matrix satisfying F F = I D p an F X = 0 (D p) p. The restricte likelihoo of F y oes not epen on β, an the log-restricte likelihoo is It hols that where l R (A; y) = c 1 2 log F Σ(A)F 1 2 y F {F Σ(A)F} 1 F y. F {F Σ(A)F} 1 F = P(A), P(A) = Σ 1 (A) Σ 1 (A)X { X Σ 1 (A)X } 1 X Σ 1 (A). Using this relation, we obtain l R (A; y) = c 1 2 log F Σ(A)F 1 2 y P(A)y. 11

12 The score is efine as S R (A) = l R (A; y)/ A, an is given by S R (A) = 1 2 trace{p(a)} y P 2 (A)y D [ {X = (A + ψ ) 1 trace Σ 1 (A)X } ] 1 X Σ 2 (A)X = {y X β(a)} Σ 2 (A){y X β(a)} = D (A + ψ ) 1 =1 D =1 D =1 x {ˆδ DIR x β(a)} 2 (A + ψ ) 2. { D k=1 (A + ψ k) 1 x k x k} 1 x (A + ψ ) 2 The REML estimator of A is obtaine by solving the non-linear equation S R (A) = 0. Again, application of Fisher-scoring algorithm requires also the Fisher information for A associate with l R (A; y), which is given by I R (A) = { E S } R(A) = 1 A 2 trace{p2 (A)} = 1 2 trace{σ(a) 2 } trace [ {X Σ 1 (A)X} 1 X Σ 3 (A)X ] Finally, the upating equation is trace ( [{X Σ 1 (A)X} 1 X Σ 3 (A)X ] 2 ). { } 1 A (k+1) REML = A(k) REML + I R (A (k) REML ) SR (A (k) REML ). Starting value A (0) REML an stopping criterion are the same as in FH an ML methos. Again, if A REML is the value obtaine in the last iteration, then the REML estimate is finally ÂREML = max(0, A REML ). 6 Function msefh The accuracy of an EBLUP ˆδ is usually assesse by the estimate MSE. Function msefh accompanies the EBLUPs with their corresponing estimate MSEs. The call to this function is msefh(formula, varir, metho = "REML", MAXITER = 100, PRECISION = , ata) where the arguments are exactly those of function eblupfh. 12

13 where Uner moel (4) (5), the MSE of the BLUP for A known is given by MSE( δ ) = E( δ δ ) 2 = g 1 (A) + g 2 (A), g 1 (A) = ψ {1 B (A)}, (15) { D 1 g 2 (A) = {B (A)} 2 x (A + ψ ) 1 x x } x, (16) where g 1 (A) is ue to the preiction of the ranom effect v an is O(1) for large D, an g 2 (A) is ue to the estimation of β an is O(D 1 ). This means that, for large D, a large reuction in MSE over MSE(ˆδ DIR ) = ψ can be obtaine when 1 B (A) = A/(A + ψ ) is small. Uner normality of ranom effects an errors, the MSE of the EBLUP satisfies =1 MSE(ˆδ ) = MSE(ˆδ ) + E(ˆδ δ ) 2 = [g 1 (A) + g 2 (A)] + g 3 (A) + o(d 1 ), where g 3 (A) is the uncertainty arising from the estimation of A, given by (17) g 3 (A) = {B (A)} 2 (A + ψ ) 1 V ( Â), (18) where V (Â) is the asymptotic variance (as D ) of the estimator Â of A. Thus, g 3 (A) epens on the choice of estimator Â but for the three available fitting methos FH, ML an REML, this term is O(D 1 ) (Prasa & Rao, 1990). The MSE of the EBLUP epens on the true variance A, which is unknown. If we want to have an unbiase estimator of the MSE up to o(d 1 ) terms (or secon orer unbiase), the MSE estimator epens on the metho use to fin Â in the EBLUP (metho). The following subsections escribe the MSE estimates returne by msefh for each selecte fitting metho. 6.1 FH fitting metho When using the FH estimator ÂF H, a secon orer unbiase estimator of MSE(ˆδ ) using ÂF H is given by mse F H (ˆδ ) = g 1 (ÂF H) + g 2 (ÂF H) + 2 g 3 (ÂF H) b F H (ÂF H){B (ÂF H)} 2, (19) where, in g 3 (ÂF H), the asymptotic variance is an b F H (A) = V (ÂF H) = see Datta, Rao & Smith (2005). 2D { D } 2 (20) =1 (A + ψ ) 1 [ 2 D { D =1 (A + ψ D } ] 2 ) 2 =1 (A + ψ ) 1 { D } 3, (21) =1 (A + ψ ) 1 13

14 6.2 ML fitting metho When using the ML estimator ÂML of A, a secon orer unbiase estimator of MSE(ˆδ ) was obtaine by Datta & Lahiri (2000) an is given by mse ML (ˆδ ) = g 1 (ÂML) + g 2 (ÂML) + 2g 3 (ÂML) b ML (ÂML) g 1 (ÂML), (22) where the asymptotic variance involve in g 3 (A) is equal to the inverse of the Fisher information given in (14), V (ÂML) = I 1 ML (A) = 2 D =1 (A + ψ, (23) 2 ) [ { D } 1 { trace l=1 (A + ψ D } ] l) 1 x l x l l=1 (A + ψ l) 2 x l x l b ML (A) = D l=1 (A + ψ l) 2 an g 1 (A) = g { } 2 1(A) A = ψ. A + ψ 6.3 REML fitting metho When using the REML estimator ÂREML, a secon orer unbiase estimator of MSE(ˆδ ) is given by mse REML (ˆδ ) = g 1 (ÂREML) + g 2 (ÂREML) + 2g 3 (ÂREML), (24) where V (ÂREML) = V (ÂML) is given in (23), see Datta & Lahiri (2000). 7 Function eblupsfh Function eblupsfh estimates omain parameters δ = h (y ), = 1,..., D, base on a FH moel with spatially correlate area effects. The call to the function is eblupsfh(formula, varir, proxmat, metho = "REML", MAXITER = 100, PRECISION = , ata) The moel consiere by this function is, in matrix notation, y = Xβ + v + e, (25) where y = (ˆδ 1 DIR DIR,..., ˆδ D ) is the vector of irect estimates for the D small areas (left han sie of formula), X = (x 1,..., x D ) is a D p matrix containing in its columns the values of p explanatory variables for the D areas (right han sie of formula), v = (v 1,..., v D ) is the vector of area effects an e = (e 1,..., e D ) is the vector of inepenent sampling errors, inepenent of v, with e N(0 D, Ψ), 14

15 where the covariance matrix Ψ = iag(ψ 1,..., ψ D ) is known (varir contains the iagonal elements). The vector δ = Xβ+v = (δ 1,..., δ D ) collects the target omain parameters. The vector v follows an simultaneously autoregressive (SAR) process with unknown autoregression parameter ρ ( 1, 1) an proximity matrix W, i.e., v = ρwv + u, (26) see Anselin (1988) an Cressie (1993). Moel (25) (26) will be calle hereafter spatial FH (SFH) moel. We assume that the matrix (I D ρw) is non-singular, where I D enotes the D D ientity matrix. Then v can be expresse as v = (I D ρw) 1 u, (27) where u = (u 1,..., u D ) satisfies u N(0 D, A I D ) for A unknown. The matrix W (proxmat) is obtaine from an original proximity matrix W 0, whose iagonal elements area equal to zero an the remaining entries are equal to 1 when the two areas corresponing to the row an the column inices are consiere as neighbor an zero otherwise. Then W is obtaine by row-stanarization of W 0, obtaine by iviing each entry of W 0 by the sum of elements in the same row, see Anselin (1988), Cressie (1993) an Banerjee, Carlin & Gelfan (2004) for more etails on the SAR(1) process with the above parametrization. When W is efine in this fashion, ρ is calle spatial autocorrelation parameter (Banerjee, Carlin & Gelfan, 2004). Hereafter, the vector of variance components will be enote θ = (θ 1, θ 2 ) = (A, ρ). Equation (27) implies that v has mean vector 0 an covariance matrix equal to G(θ) = A {(I D ρw) (I D ρw)} 1. (28) Since e is inepenent of v, the covariance matrix of y is equal to Σ(θ) = G(θ) + Ψ. Combining (25) an (27), the moel is y = Xβ + (I D ρw) 1 u + e (29) The BLUP of δ = x β + v uner moel (29) is calle Spatial BLUP (Petrucci & Salvati, 2006) an is given by δ (θ) = x β(θ) + b G(θ)Σ 1 (θ){y X β(θ)}, (30) where β(θ) = {X Σ 1 (θ)x} 1 X Σ 1 (θ)y is the WLS estimator of β an b is the 1 vector (0,..., 0, 1, 0,..., 0) with 1 in the -th position. The Spatial BLUPs δ (θ), = 1,..., D, epen on the unknown vector of variance components θ = (A, ρ). Replacing a consistent estimator ˆθ = (Â, ˆρ) for θ in (30), we obtain the Spatial EBLUPs ˆδ = δ ( ˆθ), = 1,..., D, returne by function eblupsfh. The following subsections escribe the two fitting methos (metho) for the SFH moel (25) (26) supporte by eblupsfh. 15

16 7.1 ML fitting metho The ML estimator of θ = (A, ρ) is obtaine by maximizing the log-likelihoo of θ given the ata vector y, l(θ; y) = c 1 2 log Σ(θ) 1 2 (y Xβ) Σ 1 (θ)(y Xβ), where c enotes a constant. The Fisher-scoring iterative algorithm is applie to maximize this log-likelihoo. Let S(θ) = (S A (θ), S ρ (θ)) be the scores or erivatives of the log-likelihoo with respect to A an ρ, an let I(θ) be the Fisher information matrix obtaine from l(θ; y), with elements ( ) IA,A (θ) I I(θ) = A,ρ (θ). (31) I ρ,a (θ) I ρ,ρ (θ) The Fisher-scoring algorithm starts with an initial estimate θ (0) = (σ 2(0) u, ρ (0) ) an then at each iteration k, this estimate is upate with the equation The ML equation for β yiels Let us enote an θ (k+1) = θ (k) + I 1 (θ (k) )S(θ (k) ). β(θ) = { X Σ 1 (θ)x } 1 X Σ 1 (θ)y. (32) C(ρ) = (I D ρw) (I D ρw) P(θ) = Σ 1 (θ) Σ 1 (θ)x { X Σ 1 (θ)x } 1 X Σ 1 (θ). (33) Then the erivative of C(ρ) with respect to ρ is C(ρ) ρ = W W + 2ρW W an the erivatives of Σ(θ) with respect to A an ρ are respectively given by Σ(θ) A = C 1 (ρ), Σ(θ) ρ = A C 1 (ρ) C(ρ) ρ C 1 (ρ) R(θ). The scores associate to A an ρ, after replacing (32), are given by S A (θ) = 1 2 trace { Σ 1 (θ)c 1 (ρ) } y P(θ)C 1 (ρ)p(θ)y, S ρ (θ) = 1 2 trace { Σ 1 (θ)r 1 (θ) } y P(θ)R(θ)P(θ)y. 16

17 The elements of the Fisher information matrix are I A,A (θ) = 1 2 trace { Σ 1 (θ)c 1 (ρ)σ 1 (θ)c 1 (ρ) }, (34) I A,ρ (θ) = I ρ,a = 1 2 trace { Σ 1 (θ)r(θ)σ 1 (θ)c 1 (ρ) }, (35) I ρ,ρ (θ) = 1 2 trace { Σ 1 (θ)r(θ)σ 1 (θ)r(θ) }. (36) In the function eblupsfh, the starting value of A is set to A (0) = meian(ψ ). For ρ, we take ρ (0) = 0.5. The algorithm stops either when the number of iterations k > MAXITER where MAXITER can be chosen by the user, or when max { σ 2(k+1) u σ 2(k) u σ 2(k) u 7.2 REML fitting metho }, ρ (k+1) ρ (k) < PRECISION. The REML estimator of θ is obtaine by maximizing the restricte likelihoo efine as in Section 5.3. Uner the SFH moel, the restricte log-likelihoo is given by ρ (k) l R (θ; y) = c 1 2 log F Σ(θ)F 1 2 y P(θ)y, where P(θ) is efine in (33). Using the following properties of the matrix P(θ), P(θ)Σ(θ)P(θ) = P(θ), P(θ) θ r = P(θ) Σ(θ) P(θ), θ r we obtain the scores or erivatives of l R (θ; y) with respect to each element of θ, SA(θ) R = 1 2 trace { P(θ)C 1 (ρ) } y P(θ)C 1 (ρ)p(θ)y, Sρ R (θ) = 1 2 trace {P(θ)R(θ)} y P(θ)R(θ)P(θ)y, Finally, the elements of the Fisher information obtaine from l R (θ; y) are IA,A(θ) R = 1 2 tr{p(θ)c 1 (ρ)p(θ)c 1 (ρ)}, IA,ρ(θ) R = Iρ,A R = 1 2 tr{p(θ)r(θ)p(θ)c 1 (ρ)}, Iρ,ρ(θ) R = 1 2 tr{p(θ)r(θ)p(θ)r(θ)}. The upating equation of the Fisher-scoring algorithm is then θ (k+1) = θ (k) + { I R (θ (k) ) } 1 S R (θ (k) ), 17

18 where S R (θ) = (SA R(θ), SR ρ (θ)) is the scores vector an ( I I R R (θ) = A,A (θ) IA,ρ R (θ) ) Iρ,A R (θ). (37) IR ρ,ρ(θ) is the Fisher information matrix. Starting values an stopping criterion are set the same as in the case of ML estimates. 8 Function msesfh Function msesfh gives estimate MSEs of the Spatial EBLUPs uner the SFH moel (25) (26). The call to the function is msesfh(formula, varir, proxmat, metho = "REML", MAXITER = 100, PRECISION = , ata) which has the same arguments as function eblupsfh. Similarly as in Section 6, uner normality of ranom effects an errors, the MSE of the Spatial EBLUP can be ecompose as MSE{ δ ( ˆθ)} = MSE{ δ (θ)} + E{ δ ( ˆθ) δ (θ)} 2 = [g 1 (θ) + g 2 (θ)] + g 3 (θ), (38) where the first two terms on the right han sie are easily calculate ue to the linearity of the Spatial BLUP δ (θ) in the ata vector y. They are given by { g 1 (θ) = b G(θ) G(θ)Σ 1 (θ)g(θ) } b, (39) { g 2 (θ) = b ID G(θ)Σ 1 (θ) } X(X Σ 1 (θ)x) 1 X { I D Σ 1 (θ)g(θ) } b. (40) However, for the last term g 3 (θ) = E{ δ ( ˆθ) δ (θ)} 2, an exact analytical expression oes not exist ue to the non-linearity of the EBLUP δ ( ˆθ) in y. Uner the basic FH moel (4)-(5) with inepenent ranom effects v (iagonal covariance matrix Σ), Prasa & Rao (1990) obtaine an approximation up to o(d 1 ) terms of g 3 (θ) through Taylor linearization, see Section 6. Their formula can be taken as a naive approximation of the true g 3 (θ) uner the SFH moel (25) (26). Straightforwar application of this formula to moel (25) (26) yiels g P R 3 (θ) = trace { L (θ)σ(θ)l (θ)i 1 (θ) }, (41) where I(θ) is the Fisher information efine in (31) with elements (34) (36) an ( ) b L (θ) = {C 1 (ρ)σ 1 (θ) AC 1 (ρ)σ 1 (θ)c 1 (ρ)σ 1 (θ)} b {R(θ)Σ 1 (θ) AC 1 (ρ)σ 1 (θ)r(θ)σ 1. (θ)} Then the full MSE can be approximate by MSE P R { δ ( ˆθ)} = g 1 (θ) + g 2 (θ) + g P R 3 (θ). (42) 18

19 Singh, Shukla & Kunu (2005) arrive to the same formula (42) for the true MSE uner a SFH moel. However, this formula is not accounting for the extra uncertainty ue to estimation of the spatial autocorrelation parameter ρ. Next subsections escribe the MSE estimates returne by function msesfh, epening on the specifie fitting metho. 8.1 REML fitting metho Let ˆθ REML be the estimator of ˆθ when REML fitting metho is specifie in msesfh. In that case, the function msesfh returns the MSE estimator erive by Singh, Shukla & Kunu (2005) an given by mse SSK [ δ ( ˆθ REML )] = g 1 ( ˆθ REML ) + g 2 ( ˆθ REML ) + 2g3 P R ( ˆθ REML ) g 4 ( ˆθ REML ), (43) where g 1, g 2 an g3 P R are given respectively in (39), (40) an (41), an g 4 (θ) reas g 4 (θ) = r=1 s=1 b Ψ Σ 1 (θ) 2 Σ(θ) θ r θ s Σ 1 (θ)ψ v rs (θ) b, where v rs (θ) enotes the element (r, s) of I(θ) 1, for the Fisher information matrix I(θ) efine in (31) with elements (34) (36). The secon orer erivatives of Σ(θ) are given by 2 Σ(θ) = 0 A 2 D D, 2 Σ(θ) A ρ = 2 Σ(θ) A ρ 2 Σ(θ) ρ 2 = 2AC 1 (θ) C(ρ) ρ 8.2 ML fitting metho = C 1 (θ) C(ρ) ρ C 1 (θ), C 1 (θ) C(ρ) ρ C 1 (θ) 2AC 1 (θ)w WC 1 (θ). Let ˆθ ML be the estimator of ˆθ when ML fitting metho is specifie. In that case, the estimate returne by msesfh reas mse SSK ML { δ ( ˆθ)} = g 1 ( ˆθ ML ) + g 2 ( ˆθ ML ) + 2g3 P R ( ˆθ ML ) g 4 ( ˆθ ML ) b ML( ˆθ ML ) g 1 ( ˆθ ML ). In this expression, g 1, g 2 an g3 P R are efine respectively in (39), (40) an (41), g 1 (θ) = ( g 1 (θ)/ A, g 1 (θ)/ ρ) is the graient of g 1 (θ) with erivatives g 1 (θ) A g 1 (θ) ρ = b { C 1 (ρ) 2 A C 1 (ρ)σ 1 (θ)c 1 (ρ) + A 2 C 1 (ρ)σ 1 (θ)c 1 (ρ)σ 1 (θ)c 1 (ρ) } b, = b { R(θ) 2 A C 1 (ρ)σ 1 (θ)r(θ) A 2 C 1 (ρ)σ 1 (θ)r(θ)σ 1 (θ)c 1 (ρ) } b, 19

20 an finally, b ML ( ˆθ ML ) is the bias of the ML estimator ˆθ ML up to o(d 1 ) terms, given by b ML ( ˆθ) = I 1 ( ˆθ ML )h( ˆθ ML )/2 with h( ˆθ ML ) = (h 1 ( ˆθ ML ), h 2 ( ˆθ ML )) an [ {X h 1 (θ) = trace Σ 1 (θ)x } ] 1 X Σ 1 (θ)c 1 (ρ)σ 1 (θ)x, [ {X h 2 (θ) = trace Σ 1 (θ)x } ] 1 X Σ 1 (θ)r(θ)σ 1 (θ)x. 9 Function pbmsesfh Function pbmsesfh gives parametric bootstrap estimates of the MSEs of the Spatial EBLUPs uner the SFH moel (25) (26) using an extension of González- Manteiga et al. (2008b). The MSE estimators obtaine by this proceure are expecte to be consistent if the moel parameter estimates are consistent. The call to the function is pbmsesfh(formula, varir, proxmat, B = 100, metho = "REML", MAXITER = 100, PRECISION = , ata) where the arguments are those of function eblupsfh, an aitionally the number of bootstrap replicates B. The bootstrap proceure procees as follows: 1) Fit the SFH moel (25) (26) to the initial ata y = (ˆδ 1 DIR,..., obtaining moel parameter estimates ˆθ = (Â, ˆρ) an ˆβ = β( ˆθ). ˆδ DIR D ), 2) Generate a vector t 1 whose elements are D inepenent copies of a N(0, 1). Construct bootstrap vectors u = Â1/2 t 1 an v = (I D ˆρW) 1 u, an calculate the bootstrap quantity of interest δ = X ˆβ + v, by regaring ˆβ an ˆθ as the true values of the moel parameters. 3) Generate a vector t 2 with D inepenent copies of a N(0, 1), inepenently of the generation of t 1, an construct the vector of bootstrap ranom errors e = Ψ 1/2 t 2. 4) Obtain bootstrap ata applying the moel y = δ + e = X ˆβ + v + e. 5) Regaring ˆβ an ˆθ as the true values of β an θ, fit the SFH moel (25) (26) to bootstrap ata y, obtaining estimates of the true ˆβ an ˆθ base on y. For this, first calculate the estimator of ˆβ evaluate at the true value ˆθ, β ( ˆθ) { ˆθ)X} 1 = X Σ 1 ( X Σ 1 ( ˆθ)y ; next, obtain the estimator ˆθ of ˆθ base on y an, finally, the estimator of ˆβ evaluate at ˆθ, that is, β ( ˆθ ). 6) Calculate the bootstrap Spatial BLUP from bootstrap ata y an regaring ˆθ as the true value of θ, δ ( ˆθ) = x β ( ˆθ) + b G( ˆθ)Σ( ˆθ) {y 1 X β ( ˆθ) }. 20

21 Calculate also the bootstrap Spatial EBLUP replacing ˆθ for the true ˆθ, δ ( ˆθ ) = x β ( ˆθ ) + b G( ˆθ )Σ 1 ( ˆθ )[y X β ( ˆθ )]. 7) Repeat steps 2) 6) B times. In b-th bootstrap replication, let δ (b) quantity of interest for -th area, ˆθ (b) the bootstrap estimate of θ, the bootstrap Spatial BLUP an for -th area. δ (b) 8) A parametric bootstrap estimator of g 3 (θ) is g P B 3 ( ˆθ) = B 1 B b=1 be the (b) δ ( ˆθ) ( ˆθ (b) ) the bootstrap Spatial EBLUP { δ (b) ( ˆθ (b) (b) ) δ ˆθ)} 2 (. Function pbmsesfh returns the naive parametric bootstrap estimator of the full MSE, given by mse nap B { δ ( ˆθ)} = B 1 B b=1 { δ (b) ( ˆθ } 2 (b) ) δ (b). (44) Function pbmsesfh also returns a bias-correcte MSE estimate obtaine as in Pfeffermann an Tiller (2005), an given by mse bcp B { δ ( ˆθ)} { = 2 g 1 ( ˆθ) + g 2 ( ˆθ) } + g3 P B ( ˆθ) B 1 10 Function npbmsesfh B b=1 { g 1 ( ˆθ (b) ) + g 2 ( ˆθ } (b) ). (45) Function npbmsesfh gives MSE estimates for the Spatial EBLUPs uner the SFH moel (25) (26), using the nonparametric bootstrap approach of Molina, Salvati & Pratesi (2009). The call to the function is npbmsesfh(formula, varir, proxmat, B = 100, metho = "REML", MAXITER = 100,PRECISION = , ata) where the arguments are the same as in pbmsesfh. The function resamples ranom effects {u 1,..., u D } an errors {e 1,..., e D } from the respective empirical istribution of preicte ranom effects {û 1,..., û D } an resiuals {ˆr 1,..., ˆr D }, DIR where r = ˆδ δ ( ˆθ), = 1,..., D, all previously stanarize. This metho avois the nee for istributional assumptions of u an e ; therefore, it is expecte to be more robust to non-normality of the ranom moel components. Uner moel (25) (26), the BLUPs of u an v are respectively given by ṽ(θ) = G(θ)Σ 1 (θ){y X β(θ)}, ũ(θ) = (I ρw)ṽ(θ), 21

22 an the covariance matrix of ũ(θ) is Σ u (θ) = (I ρw)g(θ)p(θ)g(θ)(i ρw ). Let us efine the vector of resiuals obtaine from the BLUP r(θ) = y X β(θ) ṽ(θ) = (ˆδ 1 DIR δ DIR 1 (θ),..., ˆδ D δ (θ)). It is easy to see that the covariance matrix of r(θ) is Σ r (θ) = Ψ P(θ)Ψ, for P(θ) efine in (33). The covariance matrices Σ u (θ) an Σ r (θ) are not iagonal; hence, the elements of the vectors ũ(θ) an r(θ) are correlate. Inee, both ũ(θ) an r(θ) lie in a subspace of imension D p. Since the methos that resample from the empirical istribution work well uner an ieally ii setup, before resampling a previous stanarization step is crucial. Here û = ũ( ˆθ) an ˆr = r( ˆθ) are transforme to achieve vectors that are as close as possible to be uncorrelate an with unit variance elements. We escribe the stanarization metho only for û, since for ˆr the process is analogous. Let us consier the estimate covariance matrix ˆΣ u = Σ u ( ˆθ). The spectral ecomposition of ˆΣ u is ˆΣ u = Q u u Q u, where u is a iagonal matrix with the D p non-zero eigenvalues of ˆΣ u an Q u is the matrix with the corresponing eigenvectors in the columns. Take 1/2 the square root matrix ˆΣ u = Q u 1/2 u Q u. Squaring this matrix gives a generalize inverse of ˆΣ u. With the obtaine square root, we transform û as û S = ˆΣ 1/2 u û. The covariance matrix of û S is then V (û S ) = Q u Q u, which is close to an ientity matrix. Observe that in the transformation û S = Q u 1/2 u Q uû, the vector Q uû contains the coorinates of û in its principal components, which are uncorrelate an with covariance matrix u. Then multiplying by 1/2 u, these coorinates are stanarize to have unit variance. Finally, this stanarize vector in the space of the principal components is returne to the original space by multiplying by Q u. Thus, the transforme vector û S contains the coorinates of the vector 1/2 u Q uû, with stanar elements, in the original space. The eigenvalues, which are the variances of the uncorrelate principal components, collect better the variability than the iagonals of ˆΣ u. Inee, simulations were inicate that taking simply û S = û / v, where v is the -th iagonal element of ˆΣ u, oes not work well. The final nonparametric bootstrap proceure is obtaine by replacing steps 2) an 3) in the parametric bootstrap 1) 8) of Section 9 by the new steps 2 ) an 3 ) given below: 22

23 2 ) With the estimates ˆθ = (Â, ˆρ) an ˆβ = β( ˆθ) obtaine in step 1), calculate preictors of v an u as follows ˆv = G( ˆθ)Σ( ˆθ) 1 (y X ˆβ), û = (I ˆρW)ˆv = (û 1,..., û m ). ˆΣ 1/2 ˆΣ 1/2 u Then take û S = u û = (û S 1,..., û S D ), where is the square root of the generalize inverse of ˆΣ u obtaine by the spectral ecomposition. It is convenient to re-scale the elements û S so that they have sample mean exactly equal to zero an sample variance Â. This is achieve by the transformation û SS = Â(û S D 1 D l=1 ûs l ) D 1 D =1 (ûs, = 1,..., D. D 1 D l=1 ûs l )2 Construct the vector u = (u 1,..., u D ), whose elements are obtaine by extracting a simple ranom sample with replacement of size D from the set {û SS 1,..., û SS D }. Then obtain v = (I ˆρW) 1 u an calculate the vector of bootstrap target parameters δ = X ˆβ + v = (δ1,..., δ ) 3 ) Compute the vector of resiuals ˆr = y X ˆβ ˆv = (ˆr 1,..., ˆr D ). Stanarize these resiuals as ˆr S 1/2 = ˆΣ r ˆr = (ˆr 1 S,..., ˆr D S ), where ˆΣ r = Ψ P( ˆθ)Ψ 1/2 is the estimate covariance matrix an ˆΣ r is a square root of the generalize inverse erive from the spectral ecomposition of ˆΣr. Again, re-stanarize these values ˆr SS = ˆr S m 1 D l=1 ˆrS l D 1 D =1 (ˆrS, = 1,..., D. D 1 D l=1 ˆrS l )2 Construct r = (r1,..., rd ) by extracting a simple ranom sample with replacement of size D from the set {ˆr 1 SS,..., ˆr D SS}. Then take e = (e 1,..., e D ), where e = ψ1/2 r, = 1,..., D. The function npbmsesfh yiels naive an bias-correcte nonparametric bootstrap estimators mse nanp B { δ ( ˆθ)} an mse bcnp B { δ ( ˆθ)} analogous to (44) an (45) respectively. 11 Function eblupstfh Function eblupstfh gives small area estimators of δ = h (y ), = 1,..., D, uner an extension of the FH moel that takes into account the spatial correlation between neighbor areas an also incorporates historical ata (Marhuena, Molina & Morales, 2013). The area parameter for omain at current time instant T is estimate borrowing strength from the T time instants an from the D omains. The call to the function is 23

24 eblupstfh(formula, D, T, varir, proxmat, moel = "ST", MAXITER = 100, PRECISION = , ata) Let θ t be the target area characteristic for area an time instant t, for = DIR 1,..., D an t = 1,..., T. Let ˆδ t be a irect estimator of δ t (left han sie of formula) an x t a column vector containing the aggregate values of p auxiliary variables relate linearly with δ t (right han sie of formula). The spatio-temporal FH (STFH) moel is state as follows. In the first stage, we assume ˆδ t DIR = δ t + e t, = 1,..., D, t = 1,..., T, (46) where, given δ t, sampling errors e t are assume to be inepenent an normally istribute with variances ψ t known for all an t (varir). In the secon stage, the target parameters for all omains an time points are linke through the moel δ t = x tβ + u 1 + u 2t, = 1,..., D, t = 1,..., T. (47) Here, the vectors of area-time ranom effects (u 21,..., u 2T ) are i.i.. for each area, following an AR(1) process with autocorrelation parameter ρ 2, that is, u 2t = ρ 2 u 2,t 1 + ɛ 2t, ρ 2 < 1, ɛ 2t ii N(0, σ 2 2). (48) The vector of area effects (u 11,..., u 1D ) follows a SAR(1) process with variance parameter σ 2 1, spatial autocorrelation ρ 1 an row-stanarize proximity matrix W = (w,l ) efine as in Section 7 (proxmat), that is, u 1 = ρ 1 l w,l u 1l + ɛ 1, ρ 1 < 1, ɛ 1 ii N(0, σ 2 1), = 1,..., D. (49) Let us efine the following vectors an matrices obtaine by stacking the elements of the moel in columns y = e = col ( col DIR (ˆδ t )), X = col ( col 1 D 1 t T 1 D 1 t T (x t)), col ( col (e t)), u 1 = 1 D 1 t T col (u 1) an u 2 = 1 D col ( col (u 2t). 1 D 1 t T Defining aitionally Z 1 = I D 1T where I D is the D D ientity matrix, 1 T is a vector of ones of size T an is the Kronecker prouct, Z 2 = I n, where n = DT is the total number of observations, u = (u 1, u 2) an Z = (Z 1, Z 2 ), the moel can be expresse as a general linear mixe moel in the form y = Xβ + Zu + e. Let θ = (σ 2 1, ρ 1, σ 2 2, ρ 2 ) be the vector of unknown parameters involve in the covariance matrix of y. Observe that here e N(0 n, Ψ), where 0 n enotes a vector of zeros of size n an Ψ is the iagonal matrix Ψ = iag 1 D (iag 1 t T (ψ t )). 24

25 Moreover, u N{0 n, G(θ)}, where the covariance matrix is the block iagonal matrix G(θ) = iag{σ 2 1Ω 1 (ρ 1 ), σ 2 2Ω 2 (ρ 2 )}, with Ω 1 (ρ 1 ) = { (I D ρ 1 W) (I D ρ 1 W) } 1, (50) Ω 2 (ρ 2 ) = iag 1 D {Ω 2 (ρ 2 )}, 1 ρ 2... ρ T 2 2 ρ T 2 1. Ω 2 (ρ 2 ) = 1 ρ ρ T ρ ρ T ρ2 ρ T 2 1 ρ2 T 2... ρ 2 1 Thus, the covariance matrix of y is given by Σ(θ) = ZG(θ)Z + Ψ. T T, = 1,..., D. (51) The WLS estimator of β an the (componentwise) BLUP of u obtaine by Henerson (1975) are given by β(θ) = { X Σ 1 (θ)x } 1 X Σ 1 (θ)y, ũ(θ) = G(θ)Z Σ 1 (θ){y X β(θ)}. (52) Since u = (u 1, u 2), the secon ientity leas to the BLUPs of u 1 an u 2, respectively given by ũ 1 (θ) = σ 2 1Ω 1 (ρ 1 )Z 1Σ 1 (θ){y X β(θ)}, ũ 2 (θ) = σ 2 2Ω 2 (ρ 2 )Σ 1 (θ){y X β(θ)}. Replacing an estimator ˆθ for θ in previous formulas we obtain ˆβ = β( ˆθ) an the EBLUPs of u 1 an u 2 respectively, û 1 = ũ 1 ( ˆθ) = (û 11,..., û 1D ) an û 2 = ũ 2 ( ˆθ) = (û 211,..., û 2DT ). Finally, the EBLUP of the area characteristic δ t uner the STFH moel (46) (49) returne by function eblupstfh is given by ˆδ t = x t ˆβ + û 1 + û 2t, = 1,..., D, t = 1,..., T. The following subsection escribes the REML moel fitting proceure applie by function eblupstfh to estimate θ an β. Remark 1. Computation of the inverse of the n n matrix Σ(θ) involve in (52) can be too time consuming for large n. This is replace by the inversion of two smaller matrices as follows. Observe that Σ(θ) can be expresse as Σ(θ) = σ 2 1Z 1 Ω 1 (ρ 1 )Z 1 + Γ(θ), 25

26 where Γ(θ) = iag 1 D {Γ (θ)} an Γ (θ) = σ 2 2Ω 2 (ρ 2 ) + iag 1 t T (ψ t ), = 1,..., D. Applying the inversion formula (A + CBD) 1 = A 1 A 1 C(B 1 + DA 1 C) 1 DA 1 (53) with A = Γ(θ), B = σ 2 1Ω 1 (ρ 1 ), C = Z 1 an D = Z 1, we obtain Σ 1 (θ) = Γ 1 (θ) Γ 1 (θ)z 1 { σ 2 1 Ω 1 1 (ρ 1 ) + Z 1Γ 1 (θ)z 1 } 1 Z 1 Γ 1 (θ), where Γ 1 (θ) = iag 1 D {Γ 1 (θ)}. Here, Γ (θ) is inverte using again (53). This proceure only requires inversion of the T T matrix Ω 2 (ρ 2 ) given in (51), which is constant for all, an the D D matrix Ω 1 (ρ 1 ) given in (50) REML fitting metho REML fitting metho maximizes the restricte likelihoo, which is the joint p..f. of a vector of n p linearly inepenent contrasts F y, where F is an n (n p) full column rank matrix satisfying F F = I n p an F X = 0 n p. It hols that F y is inepenent of ˆβ given in (52). Consequently, the p..f. of F y oes not epen on β an is given by { f R (θ; y) = (2π) (n p)/2 X X 1/2 Σ(θ) 1/2 X Σ 1 (θ)x 1/2 exp 1 } 2 y P(θ)y, where P(θ) = Σ 1 (θ) Σ 1 (θ)x { X Σ 1 (θ)x } 1 X Σ 1 (θ). Observe that P(θ) satisfies P(θ)Σ(θ)P(θ) = P(θ) an P(θ)X = 0 n. The REML estimator of θ = (θ 1,..., θ 4 ) = (σ 2 1, ρ 1, σ 2 2, ρ 2 ) is the maximizer of l R (θ; y) = log f R (θ; y). This maximum is compute using the Fisher-scoring algorithm. Let S R (θ) = l R (θ; y)/ θ = (S R 1 (θ),..., S R 4 (θ)) be the scores vector an I R (θ) = E{ 2 l R (θ; y)/ θ θ } = (I R rs(θ)) the Fisher information matrix associate with θ. Using the fact that P(θ) θ r = P(θ) Σ(θ) θ r P(θ), r = 1,..., 4, the first orer partial erivative of l R (θ; y) with respect to θ r is Sr R (θ) = 1 { 2 tr P(θ) Σ(θ) } + 1 θ r 2 y P(θ) Σ(θ) P(θ)y, r = 1,..., 4. θ r The element (r, s) of the Fisher information matrix is the expecte value of the negative secon orer partial erivative of l R (θ; y) with respect to θ r an θ s, which yiels Irs(θ) R = 1 { 2 tr P(θ) Σ(θ) P(θ) Σ(θ) }, r, s = 1,..., 4. θ r θ s 26

27 Then, if θ (k) is the value of the estimator at iteration k, the upating formula of the Fisher-scoring algorithm is given by θ (k+1) = θ (k) + I 1 R (θ(k) )S R (θ (k) ). Finally, the partial erivatives of Σ(θ) with respect to the components of θ, involve in S R (θ) an I R (θ), are given by Σ(θ) σ 2 1 Σ(θ) σ 2 2 = Z 1 Ω 1 (ρ 1 )Z 1, = iag {Ω 2 (ρ 2 )}, 1 D Σ(θ) ρ 1 Σ(θ) ρ 2 = σ1z 2 1 Ω 1 (ρ 1 ) Ω 1 1 (ρ 1 ) { Ω2 (ρ 2 ) = σ 2 2 iag 1 D ρ 2 Ω 1 (ρ 1 )Z ρ 1, 1 }, where an Ω 2 (ρ 2 ) ρ 2 = 1 1 ρ 2 2 Ω 1 1 (ρ 1 ) ρ 1 = W W + 2ρ 1 W W (T 1)ρ T (T 2)ρ T (T 2)ρ2 T (T 1)ρ T ρ 2Ω 2 (ρ 2 ). 1 ρ Function pbmsestfh Function pbmsestfh gives parametric bootstrap MSE estimates for the EBLUPs of the omain parameters uner the STFH moel (46) (49) as in Marhuena, Molina & Morales (2013). The call to the function is pbmsestfh(formula, D, T, varir, proxmat, B = 100, moel = "ST", MAXITER = 100, PRECISION = , ata) where the arguments are the same as in eblupstfh, together with the number of bootstrap replicates B. The parametric bootstrap proceure is escribe below: (1) Using the available ata {(ˆδ t DIR, x t ), t = 1,..., T, = 1,..., D}, fit the STFH moel (46) (49) an obtain moel parameter estimates ˆβ, ˆσ 1, 2 ˆρ 1, ˆσ 2 2 an ˆρ 2. (2) Generate bootstrap area effects {u (b) 1, = 1,..., D}, from the SAR(1) process given in (49), using (ˆσ 1, 2 ˆρ 1 ) as true values of parameters (σ1, 2 ρ 1 ). (3) Inepenently of {u (b) 1 } an inepenently for each, generate bootstrap time effects {u (b) 2t, t = 1,..., T }, from the AR(1) process given in (48), with (ˆσ 2, 2 ˆρ 2 ) acting as true values of parameters (σ2, 2 ρ 2 ). 27

28 (4) Calculate true bootstrap quantities, δ (b) t = x t ˆβ + u (b) 1 + u (b) 2t, t = 1,..., T, = 1,..., D. (5) Generate errors e (b) t sampling moel, in. N(0, ψ t ) an obtain bootstrap ata from the ˆδ DIR (b) t = δ (b) t + e (b) t, t = 1,..., T, = 1,..., D. (6) Using the new bootstrap ata {(ˆδ DIR (b) t, x t ), t = 1,..., T, = 1,..., D}, fit the STFH moel (46) (49) an obtain the bootstrap EBLUPs, ˆδ (b) t = x t ˆβ (b) + û (b) 1 + û (b) 2t, t = 1,..., T, = 1,..., D. (7) Repeat steps (1)-(6) for b = 1,..., B, where B is a large number. (8) The parametric bootstrap MSE estimates returne by function pbmsestfh are given by mse(ˆδ t ) = 1 B B b=1 (ˆδ (b) t ) 2 δ (b) t, t = 1,..., T, = 1,..., D. (54) 13 Function eblupbhf Function eblupbhf estimates the area means Ȳ, = 1,..., D, uner the unit level moel introuce by Battese, Harter & Fuller (1988) (BHF moel). The call to the function is eblupbhf(formula, om, selectom, meanxpop, popnsize, metho = "REML", ata) The function allows to select a subset of omains for estimation through the argument selectom, but ropping this argument it estimates in all omains. Let Y j be the value of the target variable for unit j in omain (left han sie of formula). The BHF moel assumes Y j = x jβ + u + e j, j = 1,..., N, = 1,..., D, u ii N(0, σ 2 u), e j ii N(0, σ 2 e), (55) where x j is a vector containing the values of p explanatory variables for the same unit (right han sie of formula), u is the area ranom effect an e j is the iniviual error, where area effects u an errors e j are inepenent. Let us efine vectors an matrices obtaine by stacking in columns the elements for omain y = col 1 j N (Y j ), X = col 1 j N (x j ), e = col 1 j N (e j ). 28

29 Then, the omain vectors y are inepenent an follow the moel y = X β + u 1 N + e, e in N(0, σ 2 ei N ), = 1,..., D, where u is inepenent of e. Uner this moel, the mean vector an the covariance matrix of y are given by µ = X β an V = σ 2 u1 N 1 N + σ 2 ei N. Consier the ecomposition of y into sample an out-of-sample elements y = (y r, y s ), an the corresponing ecomposition of X an V as ( ) ( ) Xs Vs V X =, V = sr. X r V rs If σu 2 an σe 2 are known, the BLUP of the small area mean Ȳ is given by ( Ȳ = 1 Y j + ) Ỹ j, (56) N j s j r V r where Ỹj = x β j + ũ is the BLUP of Y j. Here, β is the WLS estimator of β an ũ is the BLUP of u, given respectively by ( D ) 1 D β = X V 1 s X X V 1 s y, (57) =1 ũ = γ (ȳ s x β), s (58) where ȳ s = n 1 j s Y j, x s = n 1 j s x j = 1,..., D. an γ = σu/(σ 2 u 2 + σe/n 2 ), Let ˆσ u 2 an ˆσ e 2 be consistent estimators of σu 2 an σe 2 respectively, such as those obtaine by ML or REML. The EBLUP is ( ˆȲ = 1 Y j + ) Ŷ j, N j s j r (59) =1 where Ŷj = x j ˆβ + û is the EBLUP of Y j, ˆβ an û are given respectively by ˆβ = ( D =1 X ˆV 1 s X ) 1 D =1 X ˆV 1 s y (60) û = ˆγ (ȳ s x s ˆβ), (61) with ˆγ = ˆσ 2 u/(ˆσ 2 u + ˆσ 2 e/n ) an ˆV s = ˆσ 2 u1 n 1 n + ˆσ 2 ei n, = 1,..., D. Replacing (60) an (61) in (59), we obtain the expression for the EBLUP of Ȳ returne by function eblupbhf, ˆȲ = f ȳ s + ( X f x s ) ˆβ + (1 f )û, 29

30 where f = n /N is the sampling fraction. Note that the EBLUP requires the vector of population means of the auxiliary variables X (meanxpop) an the population sizes (popnsize) apart from the sample ata (specifie in formula), but the iniviual values of the auxiliary variables for each population unit are not neee. 14 Function pbmsebhf Function pbmsebhf gives a parametric bootstrap MSE estimate for the EBLUP uner the BHF moel (55). The call to the function is pbmsebhf(formula, om, selectom, meanxpop, popnsize, B = 200, metho = "REML", ata) The function applies the parametric bootstrap proceure for finite populations introuce by González-Manteiga et al. (2008a) particularize to the estimation of means. The estimate MSEs are obtaine as follows: 1) Fit the BHF moel (55) to sample ata y s = (y 1s,..., y Ds ) an obtain moel parameter estimates ˆβ, ˆσ 2 u an ˆσ 2 e. 2) Generate bootstrap omain effects as u (b) ii N(0, ˆσ 2 u), = 1,..., D. 3) Generate, inepenently of the ranom effects u (b), bootstrap errors for sample elements e (b) ii j N(0, ˆσ e), 2 j s, an error omain means Ē (b) ii N(0, ˆσ e/n 2 ), = 1,..., D., 4) Compute the true omain means of this bootstrap population, given by Ȳ (b) = X ˆβ + u (b) + Ē (b), = 1,..., D. Observe that computation of Ȳ (b) oes not require the iniviual values x j, for each out-of-sample unit j r. 5) Using the known sample vectors x j, j s, generate the moel responses for sample elements from the moel Let y (b) s Y (b) j = x j ˆβ + u (b) + e (b) j, j s, = 1,..., D. = ((y (b) 1s ),..., (y (b) Ds ) ) be the bootstrap sample ata vector. 6) Fit the BHF moel (55) to bootstrap ata ys (b) (b) EBLUPs ˆȲ, = 1,..., D. an obtain the bootstrap 30

A comparison of small area estimators of counts aligned with direct higher level estimates

A comparison of small area estimators of counts aligne with irect higher level estimates Giorgio E. Montanari, M. Giovanna Ranalli an Cecilia Vicarelli Abstract Inirect estimators for small areas use auxiliary