Bias-correction under a semi-parametric model for small area estimation
Laura Dumitrescu, Victoria University of Wellington
Joint work with J. N. K. Rao, Carleton University
ICORS 2017 Workshop on Robust Inference for Sample Surveys, Wollongong, July 7th, 2017

Table of contents
  1 Small area estimation
  2 Robust methods
  3 Semi-parametric mixture model
  4 Simulation study

Small area estimation
  An area or domain is a geographical area: a county, a province, an administrative area, or a socio-demographic area
  The design-based approach uses domain-specific direct estimators for domains with large enough sample sizes
    $\hat{Y}_i = \sum_{k \in s} w_k (a_{ik} y_k)$
    $a_{ik}$ is the domain indicator variable
    $w_k$ are the design weights
  Direct domain estimators are not reliable for small domains
  Indirect estimators are employed, typically based on linear models

Framework
  The population of interest is divided into $m$ areas, $U_i$, $i = 1, \dots, m$
  Let $s_i = s \cap U_i$, $i = 1, \dots, m$
  The variable $y$ is observed in area $i$
  The auxiliary vector $x$ in area $i$ is known at population level
  Goal: predict the area mean $\bar{Y}_i$
  Models are employed at area level or at unit level
  Focus on mixed models, which include the area random effect
  All non-sampled values follow exactly the assumed working model

Mixed models
  Area level (Fay-Herriot, 1979): relates small area direct estimators to area-specific covariates
    $\bar{y}_i = x_i^t \beta + v_i + e_i$
  Unit level (Battese, Harter and Fuller, 1988): a nested error regression model
    $y_{ij} = x_{ij}^t \beta + v_i + e_{ij}$
  P-spline model (Opsomer, Claeskens, Ranalli, Kauermann and Breidt, 2008): avoids parametric specification of the mean function
    $y_{ij} = m_K(x_{ij}) + v_i + e_{ij}$

Semiparametric model
  The penalized spline approach allows the use of mixed model theory
    $m_K(x_{ij}; \beta, u) = \beta_0 + \beta_1 x_{ij} + \dots + \beta_p x_{ij}^p + \sum_{k=1}^{K} u_k (x_{ij} - q_k)_+^p := x_{ij}^t \beta + w_{ij}^t u$
  Minimize the penalized sum of squares
    $\min_{\beta, u} \sum_{i=1}^{m} \sum_{j=1}^{n_i} (y_{ij} - m_K(x_{ij}; \beta, u))^2 + \lambda \sum_{k=1}^{K} u_k^2$
  Choice of $\lambda$
    Cross-validation
    Mixed model approach
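
The truncated power basis and the penalized criterion can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`truncated_power_row`, `m_K`, `penalized_ss`) are hypothetical, the covariate is scalar, the degree satisfies p >= 1, and the data are passed as flat lists over all sampled units.

```python
def truncated_power_row(x, knots, p=1):
    """Return the fixed part (1, x, ..., x^p) and the spline part
    ((x - q_k)_+^p for each knot q_k) of the basis at a scalar x."""
    fixed = [x ** d for d in range(p + 1)]
    spline = [max(x - q, 0.0) ** p for q in knots]
    return fixed, spline


def m_K(x, beta, u, knots, p=1):
    """Evaluate m_K(x; beta, u) = x^t beta + w^t u."""
    fixed, spline = truncated_power_row(x, knots, p)
    return (sum(b * f for b, f in zip(beta, fixed))
            + sum(c * s for c, s in zip(u, spline)))


def penalized_ss(xs, ys, beta, u, knots, lam, p=1):
    """Residual sum of squares plus the ridge penalty lam * sum(u_k^2)."""
    rss = sum((y - m_K(x, beta, u, knots, p)) ** 2 for x, y in zip(xs, ys))
    return rss + lam * sum(c * c for c in u)


# One knot at 0.5, linear spline: m_K(1.0) = 1 + 2*1 + 3*(1 - 0.5) = 4.5
value = m_K(1.0, beta=[1.0, 2.0], u=[3.0], knots=[0.5])
```

Only the spline coefficients u are penalized, which is what lets u be treated as a random effect in the mixed model formulation.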

Outliers
  Definition: outliers are values which deviate from the pattern set by the majority of the data.
  Outlying observations can arise from measurement errors or be generated by heavy-tailed distributions.
  Part of the data may not fit the same model: mixture models.
  The main focus is on distributional robustness.
  General form of a mixture distribution
    $F = (1 - \varepsilon) G + \varepsilon H$

Estimation
  Goals
    1 optimal or nearly optimal efficiency when the model is correct
    2 small deviations from the model assumptions should only slightly affect its performance
  Robust estimators
    M-estimator: generalizes the maximum likelihood estimator; solved by numerical methods
      $\sum_{i} \psi(x_i, \theta) = 0$
    L-estimator: linear combination of a function of order statistics
      $\hat{\theta} = \sum_{i} \alpha_{ni} f(X_{(i)})$
    R-estimator: obtained by inverting a rank test
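
As a small illustration of solving an M-estimating equation numerically, the sketch below handles the location case, where the equation is the sum of psi((x_i - theta)/s) set to zero. Several choices here are assumptions not taken from the slides: the Huber psi with constant 1.345, a scale s fixed at the normalized median absolute deviation, and a simple fixed-point iteration. The function names are hypothetical.

```python
import statistics


def huber_psi(r, b=1.345):
    """Huber psi: identity on [-b, b], clipped to +/- b outside."""
    return max(-b, min(b, r))


def m_estimate_location(xs, psi=huber_psi, tol=1e-8, max_iter=200):
    """Solve sum_i psi((x_i - theta) / scale) = 0 for theta."""
    theta = float(statistics.median(xs))            # robust starting value
    mad = statistics.median([abs(x - theta) for x in xs])
    scale = mad / 0.6745 if mad > 0 else 1.0        # ASSUMPTION: scale held fixed
    for _ in range(max_iter):
        residuals = [psi((x - theta) / scale) for x in xs]
        step = scale * statistics.fmean(residuals)
        theta += step
        if abs(step) < tol:                         # estimating equation solved
            break
    return theta


data = [1, 2, 3, 4, 100]
robust_centre = m_estimate_location(data)
plain_mean = statistics.fmean(data)
```

On this toy data the single extreme value pulls the mean far from the bulk of the observations, while the M-estimate stays near the median.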

Robust methods in small area estimation
  Chambers, Chandra, Salvati and Tzavidis (2014): robust projective and robust predictive approaches
  Robust plug-in methods are projective
    Non-sampled values are assumed not to be outliers
    The approach is projective because the working model is projected onto the whole non-sampled part of the population
    Examples
      M-quantile methods (Chambers and Tzavidis, 2006)
      Robust EBLUP (Sinha and Rao, 2009)
      Semiparametric robust EBLUP (Rao, Sinha and Dumitrescu, 2014)
  Bias-corrected robust methods are robust predictive
    The approach is predictive because the sample outlier information is used to predict contamination in the variable of interest
    Some non-sampled units are outliers
    Examples
      Local bias-correction (Chambers, Chandra, Salvati and Tzavidis, 2014)
      Full bias-correction (Dongmo-Jiongo, Haziza and Duchesne, 2013)
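
Robust mixed model estimation
  Linear mixed model
    $y = X\beta + Wu + e$
  If the variance components are known
    $\hat{\beta} = (X^t V^{-1} X)^{-1} X^t V^{-1} y$
    $\hat{u} = \Sigma_u W^t V^{-1} (y - X\hat{\beta})$
  Robust estimators and predictors
    $X^T \Sigma_\varepsilon^{-1/2} \Psi[\Sigma_\varepsilon^{-1/2}(y - X\hat{\beta} - W\hat{u})] = 0$
    $Z^T \Sigma_\varepsilon^{-1/2} \Psi[\Sigma_\varepsilon^{-1/2}(y - X\hat{\beta} - W\hat{u})] - \Sigma_u^{-1/2} \Psi(\Sigma_u^{-1/2} \hat{u}) = 0$
    $\Psi[\Sigma_\varepsilon^{-1/2}(y - X\hat{\beta} - W\hat{u})]^T \Sigma_\varepsilon^{-1/2} \frac{\partial V}{\partial \theta_k} \Sigma_\varepsilon^{-1/2} \Psi[\Sigma_\varepsilon^{-1/2}(y - X\hat{\beta} - W\hat{u})] - \mathrm{tr}\left(V^{-1} \frac{\partial V}{\partial \theta_k}\right) = 0$
  where
    $\Psi(s) = (\psi_b(s_1), \psi_b(s_2), \dots)^t$, $\psi_b(s) = s \min(1, b/|s|)$, $b > 0$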
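
The function $\psi_b$ and its componentwise version $\Psi$ are simple enough to write out directly. The sketch below follows the form $s \min(1, b/|s|)$ from the slide; the function names and the default b = 1.345 are illustrative choices, and the value at s = 0 is set to 0 to avoid a division by zero.

```python
def psi_b(s, b=1.345):
    """psi_b(s) = s * min(1, b / |s|), with psi_b(0) = 0."""
    if s == 0.0:
        return 0.0
    return s * min(1.0, b / abs(s))


def Psi(residuals, b=1.345):
    """Apply psi_b to each standardized residual."""
    return [psi_b(s, b) for s in residuals]


# Small residuals pass through unchanged; large ones are shrunk to +/- b.
bounded = Psi([0.5, -3.0, 0.0, 10.0])
```

Written this way, $\psi_b$ is the identity for $|s| \le b$ and equals $b \, \mathrm{sign}(s)$ otherwise, so each residual's contribution to the estimating equations is bounded by b.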

Robust predictor
  Approximating model
    $y = X\beta + Wu + Zv + e$
  The (R)EBLUP of $\bar{Y}_i$ is taken as $\bar{X}_i^t \hat{\beta} + \bar{W}_i^t \hat{u} + \hat{v}_i$
  If the sampling fraction $n_i/N_i$ is not negligible, the best linear unbiased predictor of $\bar{Y}_i$ is
    $\hat{\mu}_i = \frac{1}{N_i} \left( \sum_{j \in s_i} y_{ij} + \sum_{j \in U_i \setminus s_i} \hat{y}_{ij} \right)$, where $\hat{y}_{ij} = x_{ij}^t \hat{\beta} + w_{ij}^t \hat{u} + \hat{v}_i$
  Robust methods are known to perform well if the distribution is symmetric, but may involve a large bias otherwise
  The case of a mixture between two semiparametric models with different means leads to a larger bias when fixed parameters and random effects are estimated/predicted using robust ML or robust MME methods
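
The predictor for a non-negligible sampling fraction combines observed values for sampled units with model predictions for the rest. A minimal sketch, assuming the estimates $\hat{\beta}$, $\hat{u}$ and $\hat{v}_i$ have already been computed elsewhere; the function names and the toy numbers are hypothetical.

```python
def predict_unit(x_row, w_row, beta_hat, u_hat, v_hat_i):
    """y_hat_ij = x_ij^t beta_hat + w_ij^t u_hat + v_hat_i."""
    fixed = sum(x * b for x, b in zip(x_row, beta_hat))
    spline = sum(w * c for w, c in zip(w_row, u_hat))
    return fixed + spline + v_hat_i


def area_mean_predictor(y_sampled, yhat_nonsampled):
    """mu_hat_i = (sum of observed y + sum of predicted y) / N_i."""
    N_i = len(y_sampled) + len(yhat_nonsampled)
    return (sum(y_sampled) + sum(yhat_nonsampled)) / N_i


# Toy area: two sampled units and two non-sampled units.
yhat = [predict_unit([1.0, x], [max(x - 0.5, 0.0)], [1.0, 0.5], [2.0], 0.25)
        for x in (0.2, 0.8)]
mu_hat = area_mean_predictor([1.0, 3.0], yhat)
```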

Mixture model
  Nonparametric model
    $y_{ij} = m(x_{ij}) + v_i + \varepsilon_{ij}$, with possible outliers in $\varepsilon$, $v$
  Mixture model
    $\zeta_m$: $y_{ij} = (1 - A_{ij}) y_{0ij} + A_{ij} y_{1ij}$, $A_{ij} \sim \mathrm{Bernoulli}(p)$
    $\zeta_0$: $y_{0ij} = m_0(x_{ij}) + v_{0i} + \varepsilon_{0ij}$
    $\zeta_1$: $y_{1ij} = m_1(x_{ij}) + v_{1i} + \varepsilon_{1ij}$
  Bias correction methods
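
One way to read the mixture model $\zeta_m$ is as a generating mechanism: each unit draws an indicator $A_{ij}$ and takes its value from $\zeta_1$ when $A_{ij} = 1$ and from $\zeta_0$ otherwise. The sketch below is illustrative only. The normal errors, the function name `draw_mixture_unit` and its arguments are assumptions; the two mean functions are the linear pair used later in the simulation study.

```python
import random


def draw_mixture_unit(x, v0, v1, m0, m1, p, rng, sd_eps=1.0):
    """Draw y_ij = (1 - A) * y_0 + A * y_1 with A ~ Bernoulli(p).

    ASSUMPTION: unit-level errors are normal with standard deviation sd_eps.
    Returns the pair (y, A).
    """
    a = 1 if rng.random() < p else 0
    y0 = m0(x) + v0 + rng.gauss(0.0, sd_eps)
    y1 = m1(x) + v1 + rng.gauss(0.0, sd_eps)
    return (1 - a) * y0 + a * y1, a


rng = random.Random(1)
draws = [draw_mixture_unit(2.0, 0.0, 0.0,
                           lambda x: 100 + 3 * x,   # m_0
                           lambda x: 150 + x,       # m_1
                           p=0.1, rng=rng)
         for _ in range(1000)]
share_from_zeta1 = sum(a for _, a in draws) / len(draws)
```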
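
Bias correction I
  Due to Chambers (1986)
    $\hat{\mu}_i^{EBLUP} = (\omega_s^{(i)})^t y_s$, $(\omega_s^{(i)})^t x_s = \sum_{j \in U_i} x_{ij}^t$
    $\hat{\mu}_i^{EBLUP} = \hat{\mu}_i^{REBLUP}$ + correction terms
  The weights
    $\omega_{hj}^{(i)} = \begin{cases} 1 + 1_{N_i - n_i}^t M_{ij}^{(i)}, & h = i,\ j \in s_i \\ 1_{N_i - n_i}^t M_{hj}^{(i)}, & h \neq i,\ j \in s_h \end{cases}$
    $M^{(i)} = N^{(i)} [I - X (X^t V^{-1} X)^{-1} X^t V^{-1}] + \dot{X}_i (X^t V^{-1} X)^{-1} X^t V^{-1}$
    $N^{(i)} = (\sigma_u^2 \dot{W}_i W^t + \sigma_v^2 \dot{Z}_i Z^t) V^{-1}$
  where
    $W_h^{(i)} = \begin{cases} \sum_{j \in s_i} \omega_{ij}^{(i)} - N_i, & h = i \\ \sum_{j \in s_h} \omega_{hj}^{(i)}, & h \neq i \end{cases}$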

  Then, under the P-spline model
    $\hat{\mu}_i^{robust} = \hat{\mu}_i^{REBLUP} + N_i^{-1} \sum_{h=1}^{m} \sum_{j \in s_h} \Psi_{c_1}[\omega_{hj}^{(i)} (y_{hj} - x_{hj}^t \hat{\beta}_R - w_{hj}^t \hat{u}_R - \hat{v}_{h,R})] + N_i^{-1} \sum_{h=1}^{m} \Psi_{c_2}(W_h^{(i)} \hat{v}_{h,R}) + N_i^{-1} \left( \sum_{h=1}^{m} \sum_{j \in s_h} \omega_{hj}^{(i)} w_{hj}^t - \sum_{j \in U_i} w_{ij}^t \right) \hat{u}_R$
  Calibration does not hold for $w$
  Choice of tuning constants
    $c_1 = k \, \mathrm{median}_{h,j}(|\omega_{hj}^{(i)}|) \, \hat{\sigma}_{eR}$, $c_2 = k \, \mathrm{median}_{h}(|W_h^{(i)}|) \, \hat{\sigma}_{vR}$
  Alternates between the EBLUP and the robust predictor

Bias correction II
  Beaumont, Haziza and Ruiz-Gazen (2013), Dongmo Jiongo, Haziza and Duchesne (2013)
  Define the conditional bias as a measure of the influence of unit $j$ in area $h$ for predicting the mean of area $i$
    $B_{hj}(y_{hj}, v_h, u) := E(\hat{\mu}_i^{BLUP} - \bar{Y}_i \mid y_{hj}, v_h, u)$
  With
    $T_i := N_i^{-1} \left( \sum_{l=1}^{m} \sum_{r \in s_l} \omega_{lr}^{(i)} w_{lr}^T - \sum_{r \in U_i} w_{ir}^T \right) u$, $r_{hj} = y_{hj} - x_{hj}^T \beta - w_{hj}^T u - v_h$
  the conditional bias is
    $B_{hj}(y_{hj}, v_h, u) = \begin{cases} N_i^{-1} \omega_{hj}^{(i)} r_{hj} + N_i^{-1} W_h^{(i)} v_h + T_i, & h \neq i,\ j \in s_h \\ N_i^{-1} W_h^{(i)} v_h + T_i, & h \neq i,\ j \in U_h \setminus s_h \\ N_i^{-1} (\omega_{ij}^{(i)} - 1) r_{ij} + N_i^{-1} W_i^{(i)} v_i + T_i, & h = i,\ j \in s_i \\ -N_i^{-1} r_{ij} + N_i^{-1} W_i^{(i)} v_i + T_i, & h = i,\ j \in U_i \setminus s_i \end{cases}$
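
  The robust predictor is
    $\hat{\mu}_i^{robust} = \hat{\mu}_i^{EBLUP} - \sum_{h=1}^{m} \sum_{j \in s_h} \hat{B}_{hj} + \sum_{h=1}^{m} \sum_{j \in s_h} \psi_d(\hat{B}_{hj})$
  with
    $\psi_d(\hat{B}_{hj}) = \begin{cases} N_i^{-1} \psi_d(\hat{\omega}_{hj}^{(i)} \hat{r}_{hj}) + N_i^{-1} \hat{W}_h^{(i)} \psi_d(\hat{v}_h) + \hat{T}_i, & h \neq i,\ j \in s_h \\ N_i^{-1} \psi_d[(\hat{\omega}_{ij}^{(i)} - 1) \hat{r}_{ij}] + N_i^{-1} \hat{W}_i^{(i)} \psi_d(\hat{v}_i) + \hat{T}_i, & h = i,\ j \in s_i \end{cases}$
  where fixed effects and random components are estimated by a robust ML method.
  Consider the class of robust predictors
    $\hat{\mu}_i^R(c) = \hat{\mu}_i^{EBLUP} + \Delta(c)$
  Within this class, we search for the value of $c$ which minimizes the maximum absolute estimated conditional bias of $\hat{\mu}_i^R(c)$
  Then
    $\hat{\mu}_i^R(c) = \hat{\mu}_i^{EBLUP} - \frac{1}{2} (\hat{B}^{min} + \hat{B}^{max})$,
  where $\hat{B}^{min} = \min_{h, j \in s_h} \{\hat{B}_{hj}(y_{hj}, \hat{v}_h, \hat{u})\}$ and $\hat{B}^{max} = \max_{h, j \in s_h} \{\hat{B}_{hj}(y_{hj}, \hat{v}_h, \hat{u})\}$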
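
The closed form of the min-max predictor is a one-line adjustment once the estimated conditional biases are available. The sketch below assumes they have been computed elsewhere and are passed as a flat list over all sampled units in all areas; the function name and the toy values are hypothetical.

```python
def minmax_bias_corrected(mu_eblup, cond_biases):
    """Return mu_EBLUP - (B_min + B_max) / 2 for one area.

    cond_biases: estimated conditional biases B_hat_hj of every sampled
    unit (h, j) with respect to the predictor for this area.
    """
    b_min = min(cond_biases)
    b_max = max(cond_biases)
    return mu_eblup - 0.5 * (b_min + b_max)


# One strongly influential unit (bias 4.0) shifts the predictor downwards.
corrected = minmax_bias_corrected(10.0, [-1.0, 0.5, 4.0])

# Conditional biases that are symmetric about zero leave the EBLUP unchanged.
unchanged = minmax_bias_corrected(10.0, [-2.0, 0.0, 2.0])
```

The correction is the midpoint of the range of estimated conditional biases, so it only moves the EBLUP when the most influential units pull in one direction more than the other.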

Setup
  True semi-parametric nested error model
    $y_{ij} = m(x_{ij}) + v_i + e_{ij}$, $i = 1, \dots, 40$, $j = 1, \dots, 40$
  Random effects $v_i$ and $e_{ij}$ are generated from contaminated normal distributions
    $v_i \overset{iid}{\sim} (1 - \gamma_1) N(0, \sigma_v^2) + \gamma_1 N(0, \sigma_{v1}^2)$,
    $e_{ij} \overset{iid}{\sim} (1 - \gamma_2) N(0, \sigma_e^2) + \gamma_2 N(0, \sigma_{e1}^2)$,
    where $\sigma_v^2 = \sigma_e^2 = 1$ and $\sigma_{v1}^2 = \sigma_{e1}^2 = 25$
  Proportions of outliers: $\gamma = \gamma_1 = \gamma_2 = 0.1$
  SRS from each area with $n_i = 4$
  Mixture of the two means $m(x) = (1 - \gamma) m_0(x) + \gamma m_1(x)$
    linear
    quadratic
  $k_1 = 1.345$ and $k_2 = 9$
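
The setup can be mimicked with the standard library alone. The sketch below draws one population and one simple random sample per area using the stated sizes and variances (standard deviations 1 and 5). It is not the authors' simulation code: the covariate distribution is not given in the slides, so a uniform(0, 1) draw is used as a placeholder, and the function names and the mean function passed in are illustrative.

```python
import random


def contaminated_normal(rng, gamma, sd, sd_out):
    """Draw from (1 - gamma) N(0, sd^2) + gamma N(0, sd_out^2)."""
    scale = sd_out if rng.random() < gamma else sd
    return rng.gauss(0.0, scale)


def simulate_population(m_fun, rng, m=40, N_i=40, gamma1=0.1, gamma2=0.1,
                        sd_v=1.0, sd_v1=5.0, sd_e=1.0, sd_e1=5.0):
    """Return a list of m areas; each area is a list of N_i (x, y) pairs."""
    population = []
    for _ in range(m):
        v_i = contaminated_normal(rng, gamma1, sd_v, sd_v1)  # area effect
        area = []
        for _ in range(N_i):
            x = rng.uniform(0.0, 1.0)  # ASSUMPTION: covariate law not in the slides
            e_ij = contaminated_normal(rng, gamma2, sd_e, sd_e1)
            area.append((x, m_fun(x) + v_i + e_ij))
        population.append(area)
    return population


def draw_srs(population, rng, n_i=4):
    """Simple random sample without replacement of n_i units per area."""
    return [rng.sample(area, n_i) for area in population]


rng = random.Random(2017)
pop = simulate_population(lambda x: 100 + 3 * x, rng)
sample = draw_srs(pop, rng)
```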

Simulated absolute biases and mean squared prediction errors (K = number of knots)

  True model  Contamination  Method   K = 0             K = 20            K = 30
                                      Bias     MSPE     Bias     MSPE     Bias     MSPE
  Linear      (0, 0)         EBLUP    0.0193   0.1878   0.0127   0.1915   0.0136   0.1886
                             REBLUP   0.0193   0.1940   0.0120   0.1968   0.0142   0.1936
              (e, 0)         EBLUP    0.0246   0.4732   0.0273   0.4682   0.0242   0.4707
                             REBLUP   0.0208   0.3168   0.0192   0.3118   0.0210   0.3165
              (0, v)         EBLUP    0.0142   0.2115   0.0175   0.2138   0.0148   0.2141
                             REBLUP   0.0140   0.2044   0.0173   0.2076   0.0148   0.2079
              (e, v)         EBLUP    0.0256   0.6201   0.0245   0.6310   0.0286   0.6220
                             REBLUP   0.0203   0.3447   0.0206   0.3433   0.0226   0.3430
  Quadratic   (0, 0)         EBLUP    0.0543   0.3997   0.0174   0.1921   0.0166   0.1969
                             REBLUP   0.1017   0.3756   0.0179   0.1976   0.0164   0.2022
              (e, 0)         EBLUP    0.0643   0.6262   0.0287   0.4792   0.0288   0.4756
                             REBLUP   0.1030   0.5251   0.0235   0.3201   0.0212   0.3210
              (0, v)         EBLUP    0.0521   0.5556   0.0156   0.2183   0.0198   0.2202
                             REBLUP   0.1156   0.4535   0.0143   0.2112   0.0200   0.2130
              (e, v)         EBLUP    0.0681   0.9451   0.0260   0.6250   0.0283   0.6404
                             REBLUP   0.1336   0.6631   0.0217   0.3560   0.0220   0.3529

Mixture of two linear models
  $m_0(x) = 100 + 3x$, $m_1(x) = 150 + x$

Simulated absolute biases and mean squared prediction errors
Models: $m_0(x) = 100 + 3x$ and $m_1(x) = 150 + x$

  Mixture     Method      K = 0               K = 20
                          Bias     MSPE       Bias     MSPE
  (0, 0, b)   EBLUP       0.0937   7.4464     0.0882   7.4957
              REBLUP      4.0219   21.8341    3.7152   23.8805
              BCI (k_1)   3.4322   18.0372    3.1751   20.8888
              BCI (k_2)   1.5730   10.023     1.2699   13.9493
              BCII        0.1981   7.2601     0.1208   8.9915
  (e, v, 0)   EBLUP       0.0217   0.6168     0.0217   0.6197
              REBLUP      0.0180   0.3421     0.0178   0.3440
              BCI (k_1)   0.0205   0.3652     0.0204   0.3697
              BCI (k_2)   0.0233   0.5828     0.0230   0.5893
              BCII        0.0205   0.4578     0.0201   0.4620
  (e, v, b)   EBLUP       0.0886   9.4479     0.0858   9.49290
              REBLUP      3.9278   22.2256    3.7386   24.8069
              BCI (k_1)   3.1015   17.4217    2.9441   20.6463
              BCI (k_2)   1.1727   10.2148    1.0513   13.7787
              BCII        0.1973   8.7437     0.1502   10.0789

Mixture of two quadratics
  $m_0(x) = 1 + x + x^2$, $m_1(x) = 2 - x - 3x^2$

Simulated absolute biases and mean squared prediction errors
Models: $m_0(x) = 1 + x + x^2$ and $m_1(x) = 2 - x - 3x^2$, 20 knots

  Mixture     Method      Bias     MSPE
  (0, 0, b)   EBLUP       0.0538   1.4086
              REBLUP      0.6221   1.2053
              BCI (k_1)   0.3849   0.9991
              BCI (k_2)   0.1472   1.0375
              BCII        0.0781   1.1184
  (e, v, 0)   EBLUP       0.0236   0.6305
              REBLUP      0.0209   0.3553
              BCI (k_1)   0.0211   0.3860
              BCI (k_2)   0.0233   0.6004
              BCII        0.0212   0.4765
  (e, v, b)   EBLUP       0.0474   2.2603
              REBLUP      0.5844   1.4148
              BCI (k_1)   0.3261   1.2446
              BCI (k_2)   0.1455   1.5596
              BCII        0.1040   1.6208

Simulated absolute biases and mean squared prediction errors
Models: $m_0(x) = 1 + x$ and $m_1(x) = 1 + x + 4x^2$

  Mixture     Method        Bias     MSPE
  (0, 0, b)   EBLUP         0.0409   1.0841
              REBLUP        0.5136   0.9073
              BCI (k = 3)   0.2037   0.9991
              BCII          0.0771   0.8640
  (e, v, 0)   EBLUP         0.0328   0.6179
              REBLUP        0.0242   0.3422
              BCI (k = 3)   0.0248   0.4384
              BCII          0.0275   0.4569
  (e, v, b)   EBLUP         0.0508   2.1058
              REBLUP        0.5420   1.2334
              BCI (k = 3)   0.1897   1.1625
              BCII          0.0939   1.4690

Further research interests
  Inference: bootstrap MSPE estimation; analytic MSPE approximations
  Generalized Linear Mixed Model
  Informative sampling

Thank You!