Outlier Robust Small Area Estimation


University of Wollongong Research Online
Centre for Statistical & Survey Methodology Working Paper Series
Faculty of Engineering and Information Sciences
2009

Outlier Robust Small Area Estimation
R. Chambers, University of Wollongong; Hukum Chandra, University of Wollongong; N. Salvati, University of Pisa; N. Tzavidis, University of Manchester

Recommended Citation: Chambers, R.; Chandra, Hukum; Salvati, N.; and Tzavidis, N., Outlier Robust Small Area Estimation, Centre for Statistical and Survey Methodology, University of Wollongong, Working Paper 16-09, 2009, 40p.

Research Online is the open access institutional repository for the University of Wollongong. For further information contact the UOW Library: research-pubs@uow.edu.au

Centre for Statistical and Survey Methodology, The University of Wollongong
Working Paper: Outlier Robust Small Area Estimation. R Chambers, H Chandra, N Salvati and N Tzavidis

Copyright 2008 by the Centre for Statistical & Survey Methodology, UOW. Work in progress, no part of this paper may be reproduced without permission from the Centre.

Centre for Statistical & Survey Methodology, University of Wollongong, Wollongong NSW 2522. Phone , Fax Email: anca@uow.edu.au

Outlier Robust Small Area Estimation

R. Chambers 1, H. Chandra 2, N. Salvati 3 and N. Tzavidis 4

Abstract: Outliers are a well-known problem in survey estimation, and a variety of approaches have been suggested for dealing with them in this context. However, when the focus is on small area estimation using the survey data, much less is known, even though outliers within a small area sample are clearly much more influential than they are in the larger overall sample. To the best of our knowledge, Chambers and Tzavidis (2006) was the first published paper in small area estimation that explicitly addressed the issue of outlier robustness, using an approach based on fitting outlier robust M-quantile models to the survey data. Recently, Sinha and Rao (2009) have also addressed this issue from the perspective of linear mixed models. Both these approaches, however, use plug-in robust prediction. That is, they replace parameter estimates in optimal, but outlier sensitive, predictors by outlier robust versions. Unfortunately, this approach may involve an unacceptable prediction bias (but a low prediction variance) in situations where the outliers are drawn from a distribution that has a different mean to the rest of the survey data (Chambers, 1986), which then leads to the suggestion that outlier robust prediction should include an additional term that makes a correction for this bias. In this paper, we explore the extension of this idea to the small area estimation situation and we propose two different analytical mean squared error (MSE) estimators for outlier robust predictors of small area means. We use simulation based on realistic outlier contaminated data to evaluate how the extended small area estimation approach compares with the plug-in robust methods described earlier. The empirical results show that the bias-corrected predictive estimators are less biased than the projective estimators, especially when there are outliers in the area effects. Moreover, in the simulation experiments we contrast the proposed MSE estimators with those generally utilized for the plug-in robust predictors. The proposed bias-robust and linearization-based MSE estimators appear to perform well when used with the robust predictors of small area means that are considered in this paper.

Key words and phrases: Linear mixed model; M-quantile model; M-estimation; Robust prediction; Bias-variance trade-off; EBLUP; Robust bias correction.

1 University of Wollongong, Wollongong 2522, New South Wales, Australia. E-mail: ray@uow.edu.au
2 Indian Agricultural Statistics Research Institute. E-mail: hchandra@iasri.res.in
3 University of Pisa. E-mail: salvati@ec.unipi.it
4 University of Manchester. E-mail: nikos.tzavidis@manchester.ac.uk

1. Introduction

Outliers are a fact of life for any survey, and especially so for business surveys. As a consequence, a variety of methods have been devised to mitigate the effects of outlier values on survey estimates. Some of these, like identification and removal of these data values by experienced data experts during survey processing, can be effective in ensuring that the resulting survey estimates are unaffected by them. However, being somewhat subjective, such methods are not amenable to scientific evaluation. As a consequence there are a number of objective methods for survey estimation that use statistical rules to decide whether an observation is a potential outlier and to down-weight its contribution to the survey estimates if this is the case. Generally, an outlier robust estimator of this type is based on the assumption that the non-outlier sample values all follow a well-behaved working model and so it generally involves prediction of the sum (or mean) of these values under this working model. In practice, this often involves replacement of an outlying sample value by an estimate of what it should have been if in fact it had been generated under the working model. We refer to such methods as Robust Projective in what follows since they project sample non-outlier behaviour on to the non-sampled part of the survey population.

Robust Projective methods essentially emulate the subjective approach described earlier, and typically lead to biased estimators with lower variances than would otherwise be the case. The reason for the bias is not difficult to find: it is extremely unlikely that the non-sampled values in the target population are drawn from a distribution with the same mean as the sample non-outliers, and yet these methods are built on precisely this assumption. Chambers (1986) recognised this dilemma and coined the concept of a 'representative outlier', i.e. a sample outlier that is potentially drawn from a group of population outliers and hence cannot be unit-weighted in estimation.
He noted that representative outliers cannot be treated on the same basis in estimation as other sample data more consistent with the working model for the population, since such values can seriously destabilise the survey estimates, and suggested addition of an outlier robust bias correction term to a Robust Projective survey estimator, e.g. one based on outlier-robust estimates of the model parameters. Welsh and Ronchetti (1998) expand on this idea, applying it more generally to estimation of the finite population distribution of a survey variable in the presence of representative outliers. A similar idea is implicit in the approach described in Chambers et al. (1993), where a nonparametric bias correction is suggested. In what

follows, we refer to methods that allow for contributions from representative sample outliers as Robust Predictive since they attempt to predict the contribution of the population outliers to the population quantity of interest.

If outliers are a concern for estimation of population quantities, it is safe to say that they are even more of a concern in small area estimation, where sample sizes are considerably smaller and model-dependent estimation is the norm. It is easy to see that an outlier that destabilises a population estimate based on a large survey sample will almost certainly destroy the validity of the corresponding direct estimate for the small area from which the outlier is sourced, since this estimate will be based on a much smaller sample. This problem does not disappear when the small area estimator is an indirect one, e.g. an Empirical Best Linear Unbiased Predictor (EBLUP), since the weights underpinning this estimator will still put most emphasis on data from the small area of interest, and the estimates of the model parameters underpinning the estimator will themselves be destabilised by the sample outliers. Consequently, it is of some interest to see how outlier robust survey estimation can be adapted to this situation.

Chambers and Tzavidis (2006) explicitly address this issue of outlier robustness, using an approach based on fitting outlier robust M-quantile models to the survey data. Recently, Sinha and Rao (2009) have also addressed this issue from the perspective of linear mixed models. Both these approaches, however, use plug-in robust prediction. That is, they replace parameter estimates in optimal, but outlier sensitive, predictors by outlier robust versions (a Robust Projective approach). Unfortunately, this approach may involve an unacceptable prediction bias (but a low prediction variance) in situations where the outliers are drawn from a distribution that has a different mean to the rest of the survey data.
After discussing Robust Projective estimators for small areas in Section 2, we explore the extension of the Chambers (1986) Robust Predictive approach to the small area estimation situation in Section 3. In Section 4 we propose two different analytical mean squared error (MSE) estimators for outlier robust predictors of small area means. In particular, the first proposal is based on bias-robust mean squared error estimation discussed by Chambers et al. (2007) and represents an extension of the ideas in Royall and Cumberland (1978). We show how this approach can be useful for estimating the MSE of small area predictors based on the Sinha and Rao (2009) approach. The second MSE estimator is developed under the conditional version of the linear mixed

model and it uses first order approximations to the variances of solutions of estimating equations. This last approach can be used for estimating the MSE of a wide variety of small area 'pseudo-linear' predictors, i.e. predictors that can be written as weighted sums, where the weights are sample data dependent. Examples of such predictors are mixed model and M-quantile model-based predictors under both the Robust Projective and the Robust Predictive approaches. In Sections 5 and 6 we use model-based simulations based on realistic outlier contaminated data scenarios as well as design-based simulations to evaluate how these two different approaches compare, both in terms of estimation performance and in terms of MSE estimation performance. Section 7 concludes the paper with some final remarks, and a discussion of future research aimed at outlier robust small area inference.

2. Robust Projective Estimation for Small Areas

In what follows we assume that unit record data are available at small area level. For the sampled units in the population this consists of indicators of small area affiliation, values $y_j$ of the variable of interest, values $x_j$ of a $p \times 1$ vector of individual level covariates, and values $z_j$ of a vector of area level covariates. For the non-sampled population units we do not know the values of $y_j$. However it is assumed that all areas are sampled and that we know the numbers of such units in each small area and the respective small area averages of $x_j$ and $z_j$. We also assume that there is a linear relationship between $y_j$ and $x_j$ and that sampling is noninformative for the small area distribution of $y_j$ given $x_j$, allowing us to use population level models with the sample data.

A popular way of using the above data in small area estimation is to assume a linear mixed model, with random effects for the small areas of interest (see Rao, 2003). Let $y$, $X$ and $Z$ denote the population level vector and matrices defined by $y_j$, $x_j$ and $z_j$ respectively. Then

$$y = X\beta + Zu + e \tag{1}$$

where $u \sim N(0, \Sigma_u)$ is a vector of $mq$ area-specific random effects and $e \sim N(0, \Sigma_e)$ is a vector of $N$

individual specific random effects. Here $m$ is the number of small areas that make up the population and $q$ is the dimension of $z_j$. It is assumed that the covariance matrices $\Sigma_u$ and $\Sigma_e$ are defined in terms of a lower dimensional set of parameters $\theta = (\theta_1, \ldots, \theta_K)$, which are typically referred to as the variance components of (1), while $\beta$ is usually referred to as its fixed effect. Let $\hat\beta$ and $\hat u$ denote estimates of the fixed and random effects in (1). The EBLUP of the area $i$ mean of the $y_j$ under (1) is then

$$\hat{\bar y}_i^{EBLUP} = N_i^{-1}\left\{ n_i \bar y_{si} + (N_i - n_i)\left(\bar x_{ri}^T\hat\beta + \bar z_{ri}^T\hat u\right)\right\} \tag{2}$$

where $\hat u$ denotes the vector of the estimated area specific random effects and we use indices of $s$ and $r$ to denote sample and non-sample quantities respectively. Thus $\bar y_{si}$ is the average of the $n_i$ sample values of $y_j$ from area $i$, and $\bar x_{ri}$ and $\bar z_{ri}$ denote the vectors of average values of $x_j$ and $z_j$ respectively for the $N_i - n_i$ non-sampled units in the same area.

From a Robust Projective viewpoint, (2) can be made insensitive to sample outliers by replacing $\hat\beta$ and $\hat u$ by outlier robust alternatives. To motivate this approach, we initially assume the variance components $\theta$ are known, so the covariance matrices $\Sigma_u$ and $\Sigma_e$ in (1) are known. Put $V_s = \Sigma_{es} + Z_s\Sigma_u Z_s^T$, where $\Sigma_{es}$ denotes the sample component of $\Sigma_e$. Then the BLUE of the fixed effect vector $\beta$ is

$$\tilde\beta = \{X_s^T V_s^{-1} X_s\}^{-1} X_s^T V_s^{-1} y_s, \tag{3}$$

while the BLUP of the random effects vector $u$ is

$$\tilde u = \Sigma_u Z_s^T V_s^{-1}(y_s - X_s\tilde\beta). \tag{4}$$

It is easy to see that (3) and (4) are solutions to

$$X_s^T V_s^{-1}(y_s - X_s\beta) = 0 \tag{5}$$

and

$$\Sigma_u Z_s^T V_s^{-1}(y_s - X_s\beta) - u = 0. \tag{6}$$

A straightforward way to make the solutions to (5) and (6) robust to sample outliers is therefore to replace

them by

$$X_s^T V_s^{-1/2}\psi\left(V_s^{-1/2}\{y_s - X_s\beta\}\right) = 0 \tag{7}$$

and

$$\Sigma_u Z_s^T V_s^{-1/2}\psi\left(V_s^{-1/2}\{y_s - X_s\beta\}\right) - \Sigma_u^{1/2}\psi\left(\Sigma_u^{-1/2}u\right) = 0. \tag{8}$$

Here $\psi$ is a bounded influence function and $\psi(a)$ denotes the vector defined by applying $\psi$ to every component of $a$. Unfortunately, since $V_s$ is not a diagonal matrix, the solution to (8) can be numerically unstable. An alternative approach was therefore suggested by Fellner (1986), who noted that any solution to (5) and (6) was also a solution to

$$X_s^T\Sigma_{es}^{-1}(y_s - X_s\beta - Z_s u) = 0$$

and

$$Z_s^T\Sigma_{es}^{-1}(y_s - X_s\beta - Z_s u) - \Sigma_u^{-1}u = 0.$$

He suggested that these alternative estimating equations (and hence their solutions) be made outlier robust by replacing them by

$$X_s^T\Sigma_{es}^{-1/2}\psi\left(\Sigma_{es}^{-1/2}\{y_s - X_s\beta - Z_s u\}\right) = 0 \tag{9}$$

and

$$Z_s^T\Sigma_{es}^{-1/2}\psi\left(\Sigma_{es}^{-1/2}\{y_s - X_s\beta - Z_s u\}\right) - \Sigma_u^{-1/2}\psi\left(\Sigma_u^{-1/2}u\right) = 0. \tag{10}$$

Since (9) and (10) assume the variance components $\theta$ are known, their usefulness is somewhat limited unless outlier robust estimators of these parameters can also be defined. This is an issue investigated by Richardson and Welsh (1995). These authors propose two outlier robust variations to the maximum likelihood estimating equations for $\theta$. One of these (ML Proposal II) leads to an estimating equation for the variance component $\theta_k$ of $\theta$ of the form

$$\psi^T\left(V_s^{-1/2}\{y_s - X_s\beta\}\right)V_s^{-1/2}\left(\partial_{\theta_k}V_s\right)V_s^{-1/2}\psi\left(V_s^{-1/2}\{y_s - X_s\beta\}\right) = tr\left\{D_n\left(\partial_{\theta_k}V_s\right)\right\} \tag{11}$$

where $\partial_{\theta_k}V_s$ denotes the first order partial derivative of $V_s$ with respect to the variance component $\theta_k$

and, for $Z \sim N(0,1)$,

$$D_n = E\{\psi^2(Z)\}V_s^{-1}. \tag{12}$$

Sinha and Rao (2009) describe an approach to outlier robust estimation of $\beta$ and $u$ in (1) that builds on these results, substituting approximate solutions to both (7) and (11) into the Fellner estimating equation (10) to obtain an outlier robust estimate of the area effect $u$. In particular, their approach replaces (7) by

$$X_s^T V_s^{-1}U_s^{1/2}\psi\left(U_s^{-1/2}\{y_s - X_s\beta\}\right) = 0 \tag{13}$$

where $U_s = diag(V_s)$, and replaces (11) by

$$\psi^T\left(U_s^{-1/2}\{y_s - X_s\beta\}\right)U_s^{1/2}V_s^{-1}\left(\partial_{\theta_k}V_s\right)V_s^{-1}U_s^{1/2}\psi\left(U_s^{-1/2}\{y_s - X_s\beta\}\right) = tr\left\{D_n\left(\partial_{\theta_k}V_s\right)\right\}. \tag{14}$$

Since the solutions to (13) and (14) depend on the influence function $\psi$, we denote them by a superscript of $\psi$ below. The Sinha and Rao (2009) Robust Projective alternative to (2) is then

$$\hat{\bar y}_i^{SR} = \bar x_i^T\hat\beta^\psi + \bar z_i^T\hat u^\psi. \tag{15}$$

Note that (15) estimates the area $i$ mean under (1). A minor modification restricts this to the mean of the non-sampled units in area $i$, in which case (15) becomes

$$\hat{\bar y}_i^{REBLUP} = N_i^{-1}\left\{n_i\bar y_{si} + (N_i - n_i)\left(\bar x_{ri}^T\hat\beta^\psi + \bar z_{ri}^T\hat u^\psi\right)\right\}. \tag{16}$$

Hereafter we call this estimator Robust EBLUP (REBLUP).

An alternative methodology for outlier robust small area estimation is the M-quantile regression-based method described by Chambers and Tzavidis (2006). This is based on a linear model for the M-quantile regression of $y$ on $X$, i.e.

$$m_q(X) = X\beta_q \tag{17}$$

where $m_q(X)$ denotes the M-quantile of order $q$ of the conditional distribution of $y$ given $X$. An estimate $\hat\beta_q$ of $\beta_q$ can be calculated for any value of $q$ in the interval (0,1), and for each unit $j$ in sample we define its unique M-quantile coefficient under this fitted model as the value $q_j$ such that $y_j = x_j^T\hat\beta_{q_j}$, with the sample average of these coefficients in area $i$ denoted by $\bar q_i$. The M-quantile estimate of the mean of $y_j$ in

area $i$ is then

$$\hat{\bar y}_i^{MQ} = N_i^{-1}\left\{n_i\bar y_{si} + (N_i - n_i)\bar x_{ri}^T\hat\beta_{\bar q_i}\right\}. \tag{18}$$

Note that the regression M-quantile model (17) depends on the influence function $\psi$ underpinning the M-quantile. When this function is bounded, sample outliers have limited impact on $\hat\beta_q$. That is, (18) corresponds to assuming that all non-sample units in area $i$ follow the working model (17) with $q = \bar q_i$, in the sense that one can write $y_j = x_j^T\beta_{\bar q_i} + \text{noise}$ for all such units.

3. Robust Predictive Estimation for Small Areas

A problem with the Robust Projective approach is that it assumes all non-sampled units follow the working model, or, in what essentially amounts to the same thing, that any deviations from this model are noise and so cancel out on average. Thus, under the linear mixed model (1) one can see that provided the individual errors of the non-sampled units are symmetrically distributed about zero, the REBLUP (16) of Sinha and Rao (2009) will perform well since it is based on the implicit assumption that the average of these errors over the non-sampled units in area $i$ converges to zero. The M-quantile estimator MQ (18) is no different since it assumes that the errors $y_j - x_j^T\beta_{\bar q_i}$ from the area $i$-specific M-quantile regression model are noise and hence also cancel out on average. Note that this does not mean that these non-sample units are not outliers. It is just that their behaviour is such that our best prediction of their corresponding average value is zero.

Welsh and Ronchetti (1998) consider the issue of outlier robust prediction within the context of population level survey estimation. Starting with a working linear model linking the population values of $y$ and $x$, and sample data containing representative outliers with respect to this model, they extend the approach of Chambers (1986) to robust prediction of the empirical distribution function of the population values of $y$. Their argument immediately applies to robust prediction of the empirical distribution function of the area $i$ values of $y$, and leads to a predictor of the form

$$\hat F_i(t) = N_i^{-1}\left[\sum_{j\in s_i}I(y_j \le t) + n_i^{-1}\sum_{j\in s_i}\sum_{k\in r_i}I\left(x_k^T\hat\beta^\psi + \omega_{ij}^\psi\phi\left\{(y_j - x_j^T\hat\beta^\psi)/\omega_{ij}^\psi\right\} \le t\right)\right]. \tag{19}$$

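Here $\hat\beta^\psi$ denotes an M-estimator of the regression parameter in the linear working model based on a bounded influence function $\psi$, $\omega_{ij}^\psi$ is a robust estimator of the scale of the residual $y_j - x_j^T\hat\beta^\psi$ in area $i$ and $\phi$ denotes a bounded influence function that satisfies $|\phi| \ge |\psi|$. Tzavidis et al. (2009) note that the robust estimator of the area $i$ mean of the $y_j$ defined by (19) is just the expected value functional defined by it, which is

$$\hat{\bar y}_i = \int t\,d\hat F_i(t) = N_i^{-1}\left[n_i\bar y_{si} + (N_i - n_i)\left\{\bar x_{ri}^T\hat\beta^\psi + n_i^{-1}\sum_{j\in s_i}\omega_{ij}^\psi\phi\left\{(y_j - x_j^T\hat\beta^\psi)/\omega_{ij}^\psi\right\}\right\}\right]. \tag{20}$$

These authors therefore suggest an extension to the M-quantile estimator (18) by replacing $\hat\beta^\psi$ in (20) by $\hat\beta_{\bar q_i}$, which leads to a bias-corrected version of (18), hereafter MQ-BC, given by

$$\hat{\bar y}_i^{MQ-BC} = N_i^{-1}\left[n_i\bar y_{si} + (N_i - n_i)\left\{\bar x_{ri}^T\hat\beta_{\bar q_i} + n_i^{-1}\sum_{j\in s_i}\omega_{ij}^{MQ}\phi\left\{(y_j - x_j^T\hat\beta_{\bar q_i})/\omega_{ij}^{MQ}\right\}\right\}\right] \tag{21}$$

where $\omega_{ij}^{MQ}$ is a robust estimator of the scale of the residual $y_j - x_j^T\hat\beta_{\bar q_i}$ in area $i$. The use of the two influence functions $\psi$ and $\phi$ in (21) is worthy of comment. The first, $\psi$, underpins $\hat\beta_q$, and hence $\hat\beta_{\bar q_i}$. Its purpose is to ensure that sample outliers have little or no influence on the fit of the working M-quantile model. As a consequence it is bounded and down-weights these outliers. The second, $\phi$, is still bounded but less restrictive than $\psi$ (since $|\phi| \ge |\psi|$) and its purpose is to define an adjustment for the bias caused by the fact that the first two terms on the right hand side of (21) treat sample outliers as self-representing.

A similar argument can be used to modify the REBLUP (16). In particular, a Robust Predictive version of this estimator, hereafter REBLUP-BC, mimics the bias correction idea used in (21) and leads to

$$\hat{\bar y}_i^{REBLUP-BC} = \hat{\bar y}_i^{REBLUP} + \left(1 - n_iN_i^{-1}\right)n_i^{-1}\sum_{j\in s_i}\omega_{ij}^\psi\phi\left\{(y_j - x_j^T\hat\beta^\psi - z_j^T\hat u^\psi)/\omega_{ij}^\psi\right\}, \tag{22}$$

where the $\omega_{ij}^\psi$ are now robust estimates of the scale of the area $i$ residuals $y_j - x_j^T\hat\beta^\psi - z_j^T\hat u^\psi$.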
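The bias correction idea in (20) and (22) can be illustrated with a small numerical sketch. The code below is only an illustration under stated assumptions: the function names `huber` and `bias_corrected_mean` are hypothetical, $\phi$ is taken to be a Huber function, a single scale value is used for all units in the area, and the robust fitted values are supplied by the caller rather than estimated.

```python
import numpy as np

def huber(r, c):
    """Huber influence function: identity on [-c, c], clipped outside."""
    return np.clip(r, -c, c)

def bias_corrected_mean(y_s, fitted_s, fit_r_mean, N_i, omega, c_phi=3.0):
    """Projective and bias-corrected (predictive) area mean predictors.

    y_s        : sample values of y in the area
    fitted_s   : robust fitted values for the sampled units (assumed given)
    fit_r_mean : robust predicted mean for the non-sampled units (assumed given)
    N_i        : area population size
    omega      : robust residual scale (one value for the area, an assumption)
    c_phi      : tuning constant of phi (hypothetical default)
    """
    n_i = y_s.size
    resid = y_s - fitted_s
    # n_i^{-1} * sum_j omega * phi(resid_j / omega): the correction term
    correction = np.mean(omega * huber(resid / omega, c_phi))
    projective = (n_i * y_s.mean() + (N_i - n_i) * fit_r_mean) / N_i
    predictive = (n_i * y_s.mean() + (N_i - n_i) * (fit_r_mean + correction)) / N_i
    return projective, predictive

if __name__ == "__main__":
    y_s = np.array([10.0, 11.0, 9.0, 10.0, 30.0])  # last value is an outlier
    proj, pred = bias_corrected_mean(y_s, np.full(5, 10.0), 10.0, N_i=50, omega=1.0)
    print(proj, pred)
```

In this toy case the two predictors differ by $(1 - n_i/N_i)$ times the average clipped residual, which is the form of the extra term in (22). A larger `c_phi` lets the outlier contribute more to the correction, trading lower bias for higher variance.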

4. MSE Estimation for Robust Predictors

In this Section we propose two different MSE estimators for robust predictors of small area means under the Robust Projective and Robust Predictive approaches. In Section 4.1 we apply the ideas set out by Chambers et al. (2007) to develop a pseudo-linearization estimator of the MSE of REBLUP and REBLUP-BC. In Section 4.2 we use first order approximations to the variances of solutions of estimating equations to develop MSE estimators, under the conditional version of the linear mixed model, for the REBLUP, EBLUP and MQ predictors for small area means.

4.1 Bias-robust MSE estimation for REBLUP and REBLUP-BC

Sinha and Rao (2009) proposed a computationally intensive parametric bootstrap-based estimator for the MSE of REBLUP. An alternative MSE is the one that conditions on the realised values of the area effects (see Longford, 2007). In what follows we propose an estimator of the conditional MSE of the REBLUP and REBLUP-BC that is much less computationally demanding than the unconditional MSE estimators suggested by Sinha and Rao (2009). The proposed estimator is based on the pseudo-linearization approach to MSE estimation described by Chambers et al. (2007). See also Chandra and Chambers (2005, 2009) and Chandra et al. (2007). The MSE estimator can be used for predictors that can be expressed as weighted sums of the sample values. For this reason we re-express REBLUP (16) and REBLUP-BC (22) in a pseudo-linear form, and then apply heteroskedasticity-robust prediction variance estimation methods that treat these weights (which typically depend on estimated variance components) as fixed. More precisely, under model (1) the Robust BLUP of $\bar y_i$ can be expressed as

$$\hat{\bar y}_i^{RBLUP} = \sum_{j\in s}w_{ij}^{RBLUP}y_j = \left(w_{is}^{RBLUP}\right)^Ty_s, \tag{23}$$

where

$$\left(w_{is}^{RBLUP}\right)^T = N_i^{-1}\left[1_{si}^T + (N_i - n_i)\left\{\bar x_{ri}^TA_s + \bar z_{ri}^TB_s(I_s - X_sA_s)\right\}\right]$$

and $1_{si}$ is the $n$-vector that indicates which sample units belong to area $i$. Here $A_s = \left(X_s^TV_s^{-1}U_s^{1/2}W_{1s}U_s^{-1/2}X_s\right)^{-1}X_s^TV_s^{-1}U_s^{1/2}W_{1s}U_s^{-1/2}$, where $W_{1s}$ is a $n \times n$ diagonal matrix of weights

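with $j$-th component $w_{1j} = \psi\left\{U_j^{-1/2}(y_j - x_j^T\beta^\psi)\right\}\big/\left\{U_j^{-1/2}(y_j - x_j^T\beta^\psi)\right\}$; $B_s = \left(Z_s^T\Sigma_{es}^{-1/2}W_{2s}\Sigma_{es}^{-1/2}Z_s + \Sigma_u^{-1/2}W_{3s}\Sigma_u^{-1/2}\right)^{-1}Z_s^T\Sigma_{es}^{-1/2}W_{2s}\Sigma_{es}^{-1/2}$, where $W_{2s}$ is a $n \times n$ diagonal matrix of weights with $j$-th component $w_{2j} = \psi\left\{\sigma_e^{-1}(y_j - x_j^T\beta^\psi - z_j^Tu^\psi)\right\}\big/\left\{\sigma_e^{-1}(y_j - x_j^T\beta^\psi - z_j^Tu^\psi)\right\}$; and $W_{3s}$ is a $m \times m$ diagonal matrix of weights with $i$-th component $w_{3i} = \psi\left(\sigma_u^{-1}u_i^\psi\right)\big/\left(\sigma_u^{-1}u_i^\psi\right)$. The Appendix provides details on the computation of such weights. Note that the REBLUP (16) can be expressed in exactly the same way, except that all quantities in the vector $w_{is}^{RBLUP}$ that depend on (unknown) variance components now need a 'hat'.

Given this pseudo-linear representation for the REBLUP, we develop a simple first order approximation to its MSE assuming the conditional version of the model (1), i.e. the random effects are considered as fixed. In this case we can apply the approach described by Royall and Cumberland (1978) to estimate the prediction variance of the RBLUP for $\bar y_i$. Let $I(j \in i)$ denote the indicator for whether unit $j$ is in area $i$. Then

$$Var\left(\hat{\bar y}_i^{RBLUP} - \bar y_i \mid X, u\right) = N_i^{-2}\left\{\sum_{j\in s}\left(N_iw_{ij}^{RBLUP} - I(j \in i)\right)^2Var(y_j \mid x_j, u) + \sum_{j\in r_i}Var(y_j \mid x_j, u)\right\}, \tag{24}$$

where the first term on the right hand side above is estimated by replacing $Var(y_j \mid x_j, u)$ by $\hat\lambda_j^{-1}(y_j - \hat\mu_j)^2$, where $\hat\mu_j = \sum_{k\in s}\phi_{kj}y_k$ is an unbiased linear estimator of the conditional expected value $\mu_j = E(y_j \mid x_j, u)$ and $\hat\lambda_j = 1 - 2\phi_{jj} + \sum_{k\in s}\phi_{kj}^2$ is a scaling constant. Further details can be found in Chambers et al. (2007) and Salvati et al. (2009). The estimator of the conditional prediction variance of the RBLUP is

$$\hat V\left(\hat{\bar y}_i^{RBLUP}\right) = N_i^{-2}\sum_{j\in s}\left\{a_{ij}^2 + (N_i - n_i)n^{-1}\right\}\hat\lambda_j^{-1}(y_j - \hat\mu_j)^2, \tag{25}$$

where $a_{ij} = N_iw_{ij}^{RBLUP} - I(j \in i)$. Due to the well-known shrinkage effect associated with BLUPs, replacing $\hat\mu_j$ by the BLUP of $\mu_j$ under (1) in expression (25) can lead to biased estimation of the prediction variance under the conditional model. For this reason, Chambers et al. (2007) recommend that $\hat\mu_j$ be computed as the 'unshrunken' version of the BLUP for $\mu_j$: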

$$\hat\mu_j = x_j^T\beta^\psi + z_j^TB_su^\psi. \tag{26}$$

The conditional bias of the RBLUP under (1) is given by

$$E\left(\hat{\bar y}_i^{RBLUP} - \bar y_i \mid X, u\right) = \sum_{j\in s}w_{ij}^{RBLUP}\mu_j - N_i^{-1}\sum_{j\in(r_i\cup s_i)}\mu_j,$$

which has the simple plug-in estimator

$$\hat B\left(\hat{\bar y}_i^{RBLUP}\right) = \sum_{j\in s}w_{ij}^{RBLUP}\hat\mu_j - N_i^{-1}\sum_{j\in(r_i\cup s_i)}\hat\mu_j, \tag{27}$$

with $\hat\mu_j$ defined by expression (26). The estimator of the conditional MSE of the RBLUP can finally be written as

$$\widehat{MSE}\left(\hat{\bar y}_i^{RBLUP}\right) = \hat V\left(\hat{\bar y}_i^{RBLUP}\right) + \left\{\hat B\left(\hat{\bar y}_i^{RBLUP}\right)\right\}^2. \tag{28}$$

The conditional MSE of the REBLUP (16) is then estimated by replacing all unknown variance components in (28) by their estimated values. Note that: (a) $\hat\lambda_j = 1 + O(n^{-1})$ in this case, so that $\hat\lambda_j$ will be very close to one in most practical applications. This suggests that there is little to be gained by not setting $\hat\lambda_j \equiv 1$ when calculating the conditional prediction variance (25); (b) the square of the bias estimator (27) can be biased for the squared bias term in the MSE estimator. This bias can be corrected (see Chambers et al., 2007), but a small sample size could lead to this correction becoming unstable, so we prefer to use (28) since this is then a conservative estimator of the MSE of the predictor of the small area mean under model (1); (c) the heteroskedasticity-robust MSE estimator (28) ignores the extra variability associated with estimation of the variance components, and is therefore a first order approximation to the actual conditional MSE of the REBLUP. Since use of the REBLUP will typically require a large overall sample size, we expect any consequent underestimation of the conditional MSE of the REBLUP to be small.

The conditional MSE estimator for the REBLUP-BC (22) is obtained using the same heteroskedasticity-robust pseudo-linearization approach as outlined above for the MSE estimator for the REBLUP. The only difference from that development is that the weights $w_{ij}^{RBLUP}$ used in (23) are now replaced by corresponding REBLUP-BC weights

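$$\left(w_{is}^{RBLUP-BC}\right)^T = N_i^{-1}\left[\left(1_{si} + \frac{N_i - n_i}{n_i}w_{si}\right)^T + (N_i - n_i)\left\{\bar x_{ri} - n_i^{-1}\sum_{j\in s_i}w_{ij}x_j\right\}^TA_s + (N_i - n_i)\left\{\bar z_{ri} - n_i^{-1}\sum_{j\in s_i}w_{ij}z_j\right\}^TB_s\{I_s - X_sA_s\}\right], \tag{29}$$

where $w_{ij} = \omega_{ij}^\psi\phi\left\{(y_j - x_j^T\hat\beta^\psi - z_j^T\hat u^\psi)/\omega_{ij}^\psi\right\}\big/\left(y_j - x_j^T\hat\beta^\psi - z_j^T\hat u^\psi\right)$, $1_{si}$ is the $n$-vector that indicates which sample units belong to area $i$, and $w_{si}$ is the $n$-vector with $j$-th component $w_{ij}$ for $j \in s_i$ and zero otherwise. Since the REBLUP-BC is an approximately unbiased estimator of the small area mean, the squared bias term does not impact significantly on the mean squared error estimator, and so is typically omitted.

4.2 Linearization-Based MSE estimation for small area predictors

In this Section we propose a new MSE estimator, extending the linearization approach of Street et al. (1988) to estimation of prediction variance for estimators based on robust estimating equations. The MSE estimator is developed on the assumption that the working model for inference is an area-specific linear model, and so the approach conditions on area effects when applied in the context of such a model. In what follows we show how this approach can be used for estimating the MSE of the REBLUP (16), the EBLUP (2) and the MQ estimator (18). The MSE estimators of REBLUP-BC and MQ-BC are reported in the Appendix. Note that when used with an estimator based on a mixed model, the proposed MSE estimator provides a second order approximation to the true MSE since it includes a term for the contribution to variability from estimation of variance components.

MSE estimation for REBLUP

Under model (1) the prediction variance of the Robust BLUP of $\bar y_i$ can be expressed as

$$Var\left(\hat{\bar y}_i^{RBLUP} - \bar y_i\right) = Var\left\{\left(1 - \frac{n_i}{N_i}\right)\left(\bar x_{ri}^T\beta^\psi + \bar z_{ri}^Tu^\psi\right) - \left(1 - \frac{n_i}{N_i}\right)\bar y_{ri}\right\}$$
$$= \left(1 - \frac{n_i}{N_i}\right)^2\bar x_{ri}^TVar(\beta^\psi)\bar x_{ri} + \left(1 - \frac{n_i}{N_i}\right)^2\bar z_{ri}^TVar(u^\psi)\bar z_{ri} + \left(1 - \frac{n_i}{N_i}\right)^2Var(\bar e_{ri}), \tag{30}$$

assuming independence between $\beta^\psi$ and $u^\psi$. It follows that we need to estimate $Var(\beta^\psi)$ and $Var(u^\psi)$ in order to be able to calculate an estimate of the prediction variance of the RBLUP. In order to do this, put $\delta = (\beta^T, u^T)^T$, so $\delta^\psi = (\beta^{\psi T}, u^{\psi T})^T$. Then, from equations (10) and (13), $H(\delta^\psi) = 0$ where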

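$$H(\delta) = \begin{pmatrix}H_\beta(\delta)\\ H_u(\delta)\end{pmatrix} = \begin{pmatrix}X_s^TV_s^{-1}U_s^{1/2}\psi\left(U_s^{-1/2}\{y_s - X_s\beta\}\right)\\ Z_s^T\Sigma_{es}^{-1/2}\psi\left(\Sigma_{es}^{-1/2}\{y_s - X_s\beta - Z_su\}\right) - \Sigma_u^{-1/2}\psi\left(\Sigma_u^{-1/2}u\right)\end{pmatrix} = 0.$$

Since the solutions of the equations depend on the influence function $\psi$, we denote them by a superscript of $\psi$. We can use previous results on the asymptotic variance of solutions to an estimating equation (Welsh and Richardson, 1997; Sinha and Rao, 2009) to obtain a first order approximation to $Var(\delta^\psi)$ and by extension the prediction variance of the RBLUP. To do this, we note that

$$Var_0(\delta^\psi) \approx \left\{E_0(\partial_\delta H_0)\right\}^{-1}Var_0\{H(\delta_0)\}\left[\left\{E_0(\partial_\delta H_0)\right\}^{-1}\right]^T$$

where

$$Var_0\left(H_\beta(\delta_0)\right) = Var_0\left[\psi\left\{U_j^{-1/2}(y_j - x_j^T\beta_0)\right\}\right]X_s^TV_s^{-1}U_sV_s^{-1}X_s,$$
$$Var_0\left(H_u(\delta_0)\right) = E_0\left[\psi^2\left\{\sigma_e^{-1}(y_j - x_j^T\beta_0 - z_j^Tu_0)\right\}\right]Z_s^T\Sigma_{es}^{-1}Z_s, \tag{31}$$

and

$$E_0\{\partial_\beta H_\beta(\delta_0)\} = -X_s^TV_s^{-1}U_s^{1/2}E_0\left[\psi'\left\{U_s^{-1/2}(y_s - X_s\beta_0)\right\}\right]U_s^{-1/2}X_s,$$
$$E_0\{\partial_uH_u(\delta_0)\} = -Z_s^T\Sigma_{es}^{-1/2}E_0\left[\psi'\left\{\Sigma_{es}^{-1/2}(y_s - X_s\beta_0 - Z_su_0)\right\}\right]\Sigma_{es}^{-1/2}Z_s - \Sigma_u^{-1/2}E_0\left\{\psi'\left(\Sigma_u^{-1/2}u_0\right)\right\}\Sigma_u^{-1/2}. \tag{32}$$

The previous expressions lead to the estimators:

$$\widehat{Var}(\beta^\psi) = \left[\hat E\{\partial_\beta H_\beta(\delta_0)\}\right]^{-1}\widehat{Var}\{H_\beta(\delta_0)\}\left[\left\{\hat E(\partial_\beta H_\beta(\delta_0))\right\}^{-1}\right]^T,$$
$$\widehat{Var}(u^\psi) = \left[\hat E\{\partial_uH_u(\delta_0)\}\right]^{-1}\widehat{Var}\{H_u(\delta_0)\}\left[\left\{\hat E(\partial_uH_u(\delta_0))\right\}^{-1}\right]^T, \tag{33}$$

where

$$\hat E\{\partial_\beta H_\beta(\delta_0)\} = -X_s^TV_s^{-1}U_s^{1/2}RU_s^{-1/2}X_s, \qquad \hat E\{\partial_uH_u(\delta_0)\} = -Z_s^T\Sigma_{es}^{-1/2}T\Sigma_{es}^{-1/2}Z_s - \Sigma_u^{-1/2}Q\Sigma_u^{-1/2},$$
$$\widehat{Var}\{H_\beta(\delta_0)\} = \kappa_1^2(n - p)^{-1}\sum_{j=1}^n\psi^2(r_j)\,X_s^TV_s^{-1}U_sV_s^{-1}X_s, \quad\text{and}\quad \widehat{Var}\{H_u(\delta_0)\} = \kappa_2^2(n - p)^{-1}\sum_{j=1}^n\psi^2(t_j)\,Z_s^T\Sigma_{es}^{-1}Z_s.$$

Here, assuming use of a Huber Proposal 2 influence function, $R$ is a $n \times n$ diagonal matrix with $j$-th diagonal element equal to 1 if $-c < r_j < c$, 0 otherwise, with $r_j = U_j^{-1/2}(y_j - x_j^T\beta^\psi)$; the constant $c$ represents the cutoff of the bounded influence function; $T$ is a diagonal matrix of dimension $n \times n$ with $j$-th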

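diagonal element equal to 1 if $-c < t_j < c$, 0 otherwise, with $t_j = \sigma_e^{-1}(y_j - x_j^T\beta^\psi - z_j^Tu^\psi)$; $Q$ is a $m \times m$ diagonal matrix with $i$-th diagonal element equal to 1 if $-c < q_i < c$, 0 otherwise, with $q_i = \sigma_u^{-1}u_i^\psi$. The values

$$\kappa_1 = 1 + \frac{p}{n}\frac{Var(\psi'(r_j))}{\left\{E(\psi'(r_j))\right\}^2} \quad\text{and}\quad \kappa_2 = 1 + \frac{p}{n}\frac{Var(\psi'(t_j))}{\left\{E(\psi'(t_j))\right\}^2}$$

are bias corrector terms (Huber, 1981). An estimator of the prediction variance of RBLUP can be written as:

$$\hat V\left(\hat{\bar y}_i^{RBLUP} - \bar y_i\right) = h_{1i}(\theta) + h_{2i}(\theta) + h_{3i}(\theta) \tag{34}$$

where $h_{1i}(\theta) = \left(1 - n_i/N_i\right)^2\bar z_{ri}^T\hat V(u^\psi)\bar z_{ri}$ is due to the estimation of random effects, while the second term $h_{2i}(\theta) = \left(1 - n_i/N_i\right)^2\bar x_{ri}^T\hat V(\beta^\psi)\bar x_{ri}$ is due to the estimator $\beta^\psi$. The term $h_{3i}(\theta) = \left(1 - n_i/N_i\right)^2\hat V(\bar e_{ri})$ can be estimated from the area $i$ data:

$$\hat V(\bar e_{ri}) = \frac{1}{(N_i - n_i)(n_i - 1)}\sum_{j\in s_i}\left(y_j - x_j^T\beta^\psi - z_j^Tu^\psi\right)^2,$$

or from the entire data set:

$$\hat V(\bar e_{ri}) = \frac{1}{(N_i - n_i)(n - 1)}\sum_h\sum_{j\in s_h}\left(y_j - x_j^T\beta^\psi - z_j^Tu^\psi\right)^2.$$

Moreover, since we are working under the conditional approach, we have to add to the variance estimator (34) an estimator of the squared bias term. The result is that the estimator of the conditional MSE of the RBLUP can be written as:

$$\widehat{MSE}\left(\hat{\bar y}_i^{RBLUP}\right) = h_{1i}(\theta) + h_{2i}(\theta) + h_{3i}(\theta) + \left\{\hat B\left(\hat{\bar y}_i^{RBLUP}\right)\right\}^2, \tag{35}$$

where $\hat B\left(\hat{\bar y}_i^{RBLUP}\right)$ is the expression (27) developed in the previous Section. The corresponding estimator of the conditional MSE of the REBLUP (16) is obtained by adding an extra component to expression (35) due to the variability of the estimated variance components:

$$MSE\left(\hat{\bar y}_i^{REBLUP}\right) = MSE\left(\hat{\bar y}_i^{RBLUP}\right) + E\left(\hat{\bar y}_i^{REBLUP} - \hat{\bar y}_i^{RBLUP}\right)^2. \tag{36}$$

The last term is intractable and it is therefore necessary to approximate it. An approximation of this term is obtained by Taylor approximation following the results of Prasad and Rao (1990). Under the conditional approach

$$\hat{\bar y}_i^{REBLUP} - \hat{\bar y}_i^{RBLUP} \approx \left(1 - \frac{n_i}{N_i}\right)\bar z_{ri}^T\sum_{k=1}^{2}\left(\partial_{\theta_k}B_s\right)(y_s - X_s\beta^\psi)\{\hat\theta_k - \theta_k\},$$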

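where $B_s$ is defined as in the previous Section, and $\theta = (\sigma_u^2, \sigma_e^2)$ is the vector of the variance components. Assuming that the derivative of $\beta^\psi$ with respect to $\theta$ is of lower order, the term $E\left(\hat{\bar y}_i^{REBLUP} - \hat{\bar y}_i^{RBLUP}\right)^2$ in (36) is then estimated by

$$h_{4i}(\theta) = \left(1 - \frac{n_i}{N_i}\right)^2\bar z_{ri}^T\left\{\sum_{k=1}^{2}\sum_{l=1}^{2}v_{kl}\left(\partial_{\theta_k}B_s\right)\Gamma_s\left(\partial_{\theta_l}B_s\right)^T\right\}\bar z_{ri} + o(m^{-1}) \tag{37}$$

where $v_{kl}$ is the $(k,l)$ element of $Var(\hat\sigma_u^2, \hat\sigma_e^2)$ and $\Gamma_s$ is the $n \times n$ matrix with $(j,l)$ element $(z_j^Tu)(z_l^Tu) + \sigma_e^2I(j = l)$. Note that $Var(\hat\sigma_u^2, \hat\sigma_e^2)$ in (37) is obtained using the results on the asymptotic distribution of $(\hat\sigma_u^2, \hat\sigma_e^2)$ given in Sinha and Rao (2009). The MSE estimator of the REBLUP (16) then becomes:

$$MSE\left(\hat{\bar y}_i^{REBLUP}\right) = h_{1i}(\theta) + h_{2i}(\theta) + h_{3i}(\theta) + \left\{\hat B\left(\hat{\bar y}_i^{RBLUP}\right)\right\}^2 + h_{4i}(\theta). \tag{38}$$

An estimator of $MSE\left(\hat{\bar y}_i^{REBLUP}\right)$ can be obtained by replacing all unknown variance components in (38) by their estimated values $\hat\theta$. This corresponds to substituting $\delta^\psi = (\beta^\psi, u^\psi)$ by $\hat\delta^\psi = (\hat\beta^\psi, \hat u^\psi)$ in the MSE approximation (38) and leads to:

$$mse\left(\hat{\bar y}_i^{REBLUP}\right) = h_{1i}(\hat\theta) + h_{2i}(\hat\theta) + h_{3i}(\hat\theta) + \left\{\hat B\left(\hat{\bar y}_i^{REBLUP}\right)\right\}^2 + h_{4i}(\hat\theta). \tag{39}$$

We have $E\,h_{2i}(\hat\theta) = h_{2i}(\theta) + o(m^{-1})$, $E\,h_{3i}(\hat\theta) = h_{3i}(\theta) + o(m^{-1})$ and $E\,h_{4i}(\hat\theta) = h_{4i}(\theta) + o(m^{-1})$ to the desired order of approximation. However, $h_{1i}(\hat\theta)$ is not the correct estimator of $h_{1i}(\theta)$ because its bias is generally of the same order as $h_{2i}(\theta)$, $h_{3i}(\theta)$ and $h_{4i}(\theta)$. To evaluate the bias of $h_{1i}(\hat\theta)$, we use a Taylor series expansion of $h_{1i}(\hat\theta)$ around $\theta = (\sigma_u^2, \sigma_e^2)$:

$$h_{1i}(\hat\theta) = h_{1i}(\theta) + (\hat\theta - \theta)^T\nabla h_{1i}(\theta) + \tfrac{1}{2}(\hat\theta - \theta)^T\nabla^2h_{1i}(\theta)(\hat\theta - \theta) = h_{1i}(\theta) + \Delta_1 + \Delta_2.$$

If $\hat\theta$ is unbiased for $\theta$ then $E\Delta_1 = 0$. In general, if $\hat\theta$ is biased, $E\Delta_1$ is of lower order than $E\Delta_2$,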

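so

$$E\,h_{1i}(\hat\theta) \approx h_{1i}(\theta) + \tfrac{1}{2}tr\left\{\nabla^2h_{1i}(\theta)E(\hat\theta - \theta)(\hat\theta - \theta)^T\right\} = h_{1i}(\theta) + \tfrac{1}{2}tr\left\{\nabla^2h_{1i}(\theta)Var(\hat\sigma_u^2, \hat\sigma_e^2)\right\} + o(m^{-1}).$$

We denote the second term on the right hand side above by $h_{5i}(\theta)$. The estimator of the MSE of $\hat{\bar y}_i^{REBLUP}$ is then:

$$mse\left(\hat{\bar y}_i^{REBLUP}\right) = h_{1i}(\hat\theta) + h_{2i}(\hat\theta) + h_{3i}(\hat\theta) + \left\{\hat B\left(\hat{\bar y}_i^{REBLUP}\right)\right\}^2 + h_{4i}(\hat\theta) + h_{5i}(\hat\theta) \tag{40}$$

and $E\left\{mse\left(\hat{\bar y}_i^{REBLUP}\right)\right\} = MSE\left(\hat{\bar y}_i^{REBLUP}\right) + o(m^{-1})$.

MSE estimation for EBLUP

The second predictor of $\bar y_i$ that we consider is the well-known EBLUP based on (1). Note that EBLUP is a particular case of REBLUP when the bounded influence function $\psi$ is replaced by the (unbounded) identity function. Under (1) the prediction variance of the BLUP of $\bar y_i$ is

$$Var\left(\hat{\bar y}_i^{BLUP} - \bar y_i\right) = \left(1 - \frac{n_i}{N_i}\right)^2\bar x_{ri}^TVar(\tilde\beta)\bar x_{ri} + \left(1 - \frac{n_i}{N_i}\right)^2\bar z_{ri}^TVar(\tilde u)\bar z_{ri} + \left(1 - \frac{n_i}{N_i}\right)^2Var(\bar e_{ri}) \tag{41}$$

assuming independence between $\tilde\beta$ and $\tilde u$. Putting $\delta = (\beta^T, u^T)^T$, so $\tilde\delta = (\tilde\beta^T, \tilde u^T)^T$, and using results on the asymptotic variance of solutions to estimating equations (Richardson and Welsh, 1997),

$$H(\delta) = \begin{pmatrix}H_\beta(\delta)\\ H_u(\delta)\end{pmatrix} = \begin{pmatrix}X_s^T\Sigma_{es}^{-1}(y_s - X_s\beta - Z_su)\\ Z_s^T\Sigma_{es}^{-1}(y_s - X_s\beta - Z_su) - \Sigma_u^{-1}u\end{pmatrix} = 0,$$

we obtain a first order approximation to $Var(\tilde\delta)$ and by extension the prediction variance of the BLUP. The starting point is

$$Var_0(\tilde\delta) \approx \left\{E_0(\partial_\delta H_0)\right\}^{-1}Var_0\{H(\delta_0)\}\left[\left\{E_0(\partial_\delta H_0)\right\}^{-1}\right]^T,$$

which leads to the estimators:

$$\widehat{Var}(\tilde\beta) = \left[\hat E\{\partial_\beta H_\beta(\delta_0)\}\right]^{-1}\widehat{Var}\{H_\beta(\delta_0)\}\left[\left\{\hat E(\partial_\beta H_\beta(\delta_0))\right\}^{-1}\right]^T,$$
$$\widehat{Var}(\tilde u) = \left[\hat E\{\partial_uH_u(\delta_0)\}\right]^{-1}\widehat{Var}\{H_u(\delta_0)\}\left[\left\{\hat E(\partial_uH_u(\delta_0))\right\}^{-1}\right]^T, \tag{42}$$

where $\hat E\{\partial_\beta H_\beta(\delta_0)\} = -X_s^TV_s^{-1}X_s$,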

20 Ê { u H u ( 0 )}= Z s 1 es Z s 1 u, ( ) Var { H ( 0 )}= ( n p) 1 n y x =1 X V 1 s s V 1 s X s, and Var { Hu ( 0 )}= ( n p) 1 n y x ( z u) =1 Z 1 s es 1 es Z s. An estmator of the MSE for the BLUP can therefore be wrtten as: where h 1 ( )= 1 n N BLUP MSE ( ŷ )= h 1 ( )+ h ( )+ h 3 ( ) (43) z ˆ r V ( u )z r s due to the estmaton of random effects, whle the second term h ( )= 1 n N x ˆ r V area data, ˆ V (e r ) = ˆ V (e r ) = 1 (N n )(n 1) ( )x r s due to. he term h 3 ( )= 1 n 1 (N n )(n 1) h sh s y x ( z u ) N ( ) can be estmated ust usng ˆ V e r, or by usng all the sample data, y x ( z u ). Note that we have not added the squared bas estmator to (43) as we dd n the REBLUP case because ths bas s zero (see Chambers et al., 007). In order to defne the condtonal MSE of the EBLUP, we add the term h 4 ( ), see equaton (37), to (43). In the case of the EBLUP predctor for the small area mean, h 4 ( ) contans two dfferences wth respect to the same expresson developed for REBLUP: ) the matrx B s = u Z s V s 1 ; ) Var( ˆ u, ˆ e ) s obtaned usng the results of the asymptotc dstrbuton of ˆ ( u, ˆ e ) gven by Rao (003). he MSE of the EBLUP () s therefore and ts estmator can be wrtten as: EBLUP MSE ( ŷ )= h 1 ( )+ h ( )+ h 3 ( )+ h 4 ( ) (44) snce Eh 1 ( )= h ˆ 1 ( )+ h ˆ ( )+ h ˆ 3 ( )+ h ˆ 4 mse ŷ EBLUP ( ˆ ) h 1 ( ) h 4 ( )+ om ( 1 ). ( ) (45) 18
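The first-order "sandwich" approximation used above for the variance of an estimating-equation solution can be illustrated numerically. The following is a minimal sketch under simplifying assumptions, not the mixed-model computation itself: it uses ordinary least squares as a toy special case, and the names `sandwich_variance`, `bread` and `meat` are illustrative labels rather than notation from the paper.

```python
import numpy as np

def sandwich_variance(bread, meat):
    """First-order variance of an estimating-equation solution: A^{-1} B A^{-T}.

    bread: estimate of E(dH/d omega), a p x p matrix (A above)
    meat:  estimate of Var{H(omega_0)}, a p x p matrix (B above)
    """
    bread_inv = np.linalg.inv(bread)
    return bread_inv @ meat @ bread_inv.T

# Toy check with ordinary least squares (an illustrative special case).
rng = np.random.default_rng(0)
n, p = 50, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)

beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
resid = y - X @ beta_hat
s2 = resid @ resid / (n - p)      # (n - p)^{-1} times the residual sum of squares

bread = -(X.T @ X)                # derivative of H(beta) = X'(y - X beta)
meat = s2 * (X.T @ X)             # Var{H} under homoskedastic errors
var_beta = sandwich_variance(bread, meat)
```

In this special case the sign of the derivative cancels and the sandwich collapses to the classical `s2 * inv(X'X)`, which is a convenient sanity check before moving to less standard estimating functions.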

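MSE estimation for MQ

The third predictor that we consider is the MQ predictor (18) based on the M-quantile approach (Chambers and Tzavidis, 2006). For fixed q, the prediction variance of the MQ predictor is

$$\mathrm{Var}(\hat{\bar y}_i^{\mathrm{MQ}} - \bar y_i) = \Big(1 - \frac{n_i}{N_i}\Big)^2 \big\{\bar x_{ri}^{T}\,\mathrm{Var}(\hat\beta_q)\,\bar x_{ri}\big\} + \Big(1 - \frac{n_i}{N_i}\Big)^2 \mathrm{Var}(\bar e_{ri}). \qquad (46)$$

It follows that we need to estimate $\mathrm{Var}(\hat\beta_q)$ in order to be able to calculate an estimate of the prediction variance of this predictor. The starting point, as usual, is the first order approximation based on the estimating equations for $\hat\beta_q$. Putting $\omega_q = \beta_q$,

$$\mathrm{Var}_0(\hat\omega_q) \approx \{E_0(\partial_{\omega_q} H_{0q})\}^{-1}\,\mathrm{Var}_0\{H(\omega_{0q})\}\,\big[\{E_0(\partial_{\omega_q} H_{0q})\}^{-1}\big]^{T} \qquad (47)$$

with

$$H(\omega_{0q}) = \sum_{j=1}^{n} x_j\,\psi_q(r_{j0q}) = X_s^{T}\psi_q(r_{0q}),$$

where $\psi_q$ is a bounded influence function depending on q, $\psi_q(r_{0q})$ is the n-vector with elements $\psi_q(r_{j0q}) = \psi_q\{\sigma_{0q}^{-1}(y_j - x_j^{T}\omega_{0q})\}$ and $\sigma_{0q}$ is a robust estimator of the scale of the residual $y_j - x_j^{T}\omega_{0q}$. The $\mathrm{Var}_0\{H(\omega_{0q})\}$ component of expression (47) can then be written as

$$\mathrm{Var}_0\{H(\omega_{0q})\} = X_s^{T}\big[E_0\{\psi_q(r_{0q})\,\psi_q(r_{0q})^{T}\}\big]X_s,$$

because the y values are conditionally uncorrelated and $E_0\{\psi_q(r_{j0q})\} = 0$ for each q. Assuming a Huber-type influence function, we obtain

$$E_0(\partial_{\omega_q} H_{0q}) = X_s^{T}\,E_0\Big[\frac{d}{d\omega_q}\psi_q(r_{0q})\Big]_{\omega_q = \omega_{0q}} = -X_s^{T} C X_s,$$

where C is a $n \times n$ diagonal matrix with j-th diagonal component $\sigma_{0q}^{-1}E_0\{q\,I(0 < r_{j0q} \le c) + (1 - q)\,I(-c < r_{j0q} \le 0)\}$. These expressions then lead to two types of estimators:

1.
$$\widehat{\mathrm{Var}}(\hat\omega_q) = n(n - p)^{-1}\{\hat E(\partial_{\omega_q} H_{0q})\}^{-1}\,\widehat{\mathrm{Var}}\{H(\omega_{0q})\}\,\big[\{\hat E(\partial_{\omega_q} H_{0q})\}^{-1}\big]^{T} \qquad (48)$$

where $\widehat{\mathrm{Var}}\{H(\omega_{0q})\} = X_s^{T}\hat F X_s$, $\hat F$ is a diagonal matrix of dimension $n \times n$ with j-th element $\hat f_j$ defined in terms of the products $\hat w_{jq}\hat r_{jq}$; $\hat E(\partial_{\omega_q} H_{0q}) = -X_s^{T}\hat C X_s$ where $\hat C$ is a $n \times n$ diagonal matrix with j-th element $\hat c_j = \hat\sigma_q^{-1}\{q\,I(0 < \hat r_{jq} \le c) + (1 - q)\,I(-c < \hat r_{jq} \le 0)\}$. Here $\hat w_{jq}$ is the final weight in the iterative re-weighted least squares (IRLS) process, and $\hat r_{jq} = \hat\sigma_q^{-1}(y_j - x_j^{T}\hat\omega_q)$. Note the factor $n(n - p)^{-1}$ which ensures agreement with Street et al. (1988) when $X_s = 1$ and q = 0.5.

2.
$$\widehat{\mathrm{Var}}(\hat\omega_q) = \kappa^2\,\frac{(n - p)^{-1}\sum_{j=1}^{n}\psi_q^2(\hat r_{jq})}{\big\{n^{-1}\sum_{j=1}^{n}\psi_q'(\hat r_{jq})\big\}^2}\,(X_s^{T}X_s)^{-1}. \qquad (49)$$

That is, the Street et al. (1988) estimator when q = 0.5. The value

$$\kappa = 1 + \frac{p}{n}\,\frac{\mathrm{Var}\{\psi_q'(\hat r_q)\}}{[E\{\psi_q'(\hat r_q)\}]^2}$$

is the bias corrector term (Huber, 1981).

Depending on which of (48) or (49) is used, the estimator of the prediction variance of the MQ predictor when $q = q_i$ can be written as:

$$\hat V(\hat{\bar y}_i^{\mathrm{MQ}}) = \Big(1 - \frac{n_i}{N_i}\Big)^2\big\{\bar x_{ri}^{T}\,\widehat{\mathrm{Var}}(\hat\omega_{q_i})\,\bar x_{ri}\big\} + \Big(1 - \frac{n_i}{N_i}\Big)^2\hat V(\bar e_{ri}) \qquad (50)$$

with

$$\hat V(\bar e_{ri}) = (N_i - n_i)^{-1}(n - 1)^{-1}\sum_{h}\sum_{j \in s_h}(y_j - x_j^{T}\hat\omega_{q_h})^2.$$

Moreover, since we are taking a conditional approach, we have to add an estimator of the squared bias based on:

$$\hat B(\hat{\bar y}_i^{\mathrm{MQ}}) = N_i^{-1}\sum_{k}\sum_{j \in s_k} w_{ij}\,\big(x_j^{T}\hat\omega_{q_k} - x_j^{T}\hat\omega_{q_i}\big)$$

where $w_{ij} = N_i/n_i + b_{ij}$ if $j \in s_i$ and $w_{ij} = b_{ij}$ otherwise, and

$$b_i = \Big\{\sum_{j \in r_i} x_j - \frac{N_i - n_i}{n_i}\sum_{j \in s_i} x_j\Big\}^{T}\big(X_s^{T}W(q_i)X_s\big)^{-1}X_s^{T}W(q_i)$$

is a $1 \times n$ vector. The final expression for the MSE estimator of the MQ predictor is therefore:

$$\mathrm{MSE}(\hat{\bar y}_i^{\mathrm{MQ}}) = \Big(1 - \frac{n_i}{N_i}\Big)^2\big\{\bar x_{ri}^{T}\,\widehat{\mathrm{Var}}(\hat\omega_{q_i})\,\bar x_{ri}\big\} + \Big(1 - \frac{n_i}{N_i}\Big)^2\hat V(\bar e_{ri}) + \Big\{N_i^{-1}\sum_{k}\sum_{j \in s_k} w_{ij}\,\big(x_j^{T}\hat\omega_{q_k} - x_j^{T}\hat\omega_{q_i}\big)\Big\}^2. \qquad (51)$$

Note that (50) is a first order approximation to the asymptotic prediction variance of the MQ predictor, and so (51) could underestimate its MSE.

5. Results from Model-Based Simulation Studies

We provide model-based simulation results illustrating the comparative performances of the different outlier robust small area predictors described above. Population data are generated for m = 40 small areas, with samples selected by simple random sampling without replacement within each area. Population and sample sizes are the same for all areas, and are fixed at either $N_i = 100$, $n_i = 5$ or $N_i = 300$, $n_i = 15$. Values for X are generated as independently and identically distributed from a lognormal distribution with a mean of […] and a standard deviation of 0.5 on the log scale. Values for Y are generated from a linear model in x with additive effects, $y_{ij} = [\ldots] + u_i + \varepsilon_{ij}$, where the random area effects $u_i$ and individual effects $\varepsilon_{ij}$ are independently generated according to four scenarios:

[0,0] No outliers: $u \sim N(0,3)$ and $\varepsilon \sim N(0,6)$.

[e,0] Individual outliers only: $u \sim N(0,3)$ and $\varepsilon \sim \delta N(0,6) + (1 - \delta)N(0,150)$, where $\delta$ is an independently generated Bernoulli random variable with $\Pr(\delta = 1) = 0.97$, i.e. the individual effects are independent draws from a mixture of two normal distributions, with 97% on average drawn from a well-behaved N(0,6) distribution and 3% on average drawn from an outlier N(0,150) distribution.

[0,u] Area outliers only: $u \sim N(0,3)$ for areas 1-36, $u \sim N(9,20)$ for areas 37-40 and $\varepsilon \sim N(0,6)$, i.e. random effects for areas 1-36 are drawn from a well-behaved N(0,3) distribution, with those for areas 37-40 drawn from an outlier N(9,20) distribution. Individual effects are not outlier-contaminated.

[e,u] Outliers in both area and individual effects: $u \sim N(0,3)$ for areas 1-36, $u \sim N(9,20)$ for areas 37-40 and $\varepsilon \sim \delta N(0,6) + (1 - \delta)N(0,150)$.

Each scenario is independently simulated 500 times. For each simulation the population values are generated according to the underlying scenario model, a sample is selected in each area and the sample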
data are then used to compute estimates of each of the actual area means for Y. Five different estimators are used for this purpose: the standard EBLUP, see (2), which serves as a reference; the projective M-quantile estimator MQ, see (18); the robust bias-corrected predictive MQ estimator MQ-BC, see (21); the robust projective REBLUP estimator of Sinha and Rao (2009), see (16); and its robust bias-corrected version REBLUP-BC, see (22). In all cases the projective influence function is a Huber Proposal 2 type with tuning constant c = […]. In contrast, the predictive, less restrictive, influence function used in MQ-BC and REBLUP-BC is also a Huber Proposal 2 type, but with a larger tuning constant, c = 3.

The performance of these estimators across the different areas and simulations is assessed by computing the median values of their area-specific relative bias and relative root mean squared error (RRMSE). The relative bias of an estimator $\hat{\bar y}_i$ for the actual mean $\bar y_i$ of area i is the average across simulations of the errors $\hat{\bar y}_i - \bar y_i$ divided by the corresponding average value of $\bar y_i$, and its RRMSE is the square root of the average across simulations of the squares of these errors, again divided by the average value of $\bar y_i$. Table 1 sets out these median values for the different simulation scenarios and different estimators.

The relative bias results set out in Table 1 confirm our expectations regarding the behaviour of projective estimators (EBLUP, REBLUP and MQ) versus bias-corrected predictive estimators (REBLUP-BC and MQ-BC). The former are more biased than the latter as a consequence of their implicit assumption that although outlier variances may be inflated relative to non-outliers, outlier effects still have zero expectation. This increase in bias is most pronounced when there are outliers in the area effects, which is not unexpected since that is when area means are most affected by the presence of outliers in the population data.

Turning to the median RRMSE results, we see that claims in the literature (e.g. Chambers and Tzavidis, 2006) about the superior outlier robustness of MQ compared with the EBLUP certainly hold true provided the outliers are in individual effects. If there are outliers in area effects, then MQ appears to offer no extra protection compared to the EBLUP, and in fact performs worse, mainly due to its sharply increasing bias in this situation. Similarly, when we compare the EBLUP and the REBLUP we see that if outliers are associated with individual effects, then REBLUP offers better RRMSE performance than EBLUP. However, the gap between these two estimators narrows considerably when outliers are associated with area effects. In contrast, the two bias-corrected predictive estimators seem relatively robust in terms of RRMSE performance. Due to increased variability as a consequence of their bias corrections, both BC estimators are not as efficient as the projective estimators when outliers are associated with individual effects, but neither fails when there are outliers in the area effects.

We now turn to an examination of the performance of the different methods of MSE estimation investigated in the simulations. MSE estimation for the REBLUP and REBLUP-BC is implemented via the robust MSE estimators (28) and (29) (hereafter CCT) and via the linearization-based MSE estimators (40) and (A6) (hereafter CCST), while for the MQ and the MQ-BC both (51) and (A9) (CCST) and the robust MSE estimator described in Chambers et al. (2007, Section 2.3 - CCT) are calculated. The bootstrap procedure proposed by Sinha and Rao (2009) for REBLUP is also investigated, using bootstrap samples of size 100. The MSE of the EBLUP estimator is estimated by the Prasad-Rao (PR), CCT (Chambers et al., 2007, Section 2.3) and CCST (45) estimators.

The behaviour of the MSE estimators for each scenario and for each approach is shown in Table 2, where we report the median values of their area-specific relative bias, relative root mean squared error and coverage rate for a nominal 95 per cent confidence interval. These intervals are based on normal theory and are defined by the small area mean estimate plus or minus twice its corresponding estimated root mean squared error. These results show that both CCT and CCST tend to be biased low, but CCST is better in terms of coverage rate: it shows a small amount of under-coverage for all predictors. The CCST estimator is preferable to CCT for REBLUP and REBLUP-BC, showing smaller bias and more stability. Moreover, CCST seems better able to handle the scenarios where outliers are present. The CCT and CCST estimators perform similarly for MQ-BC, although CCST seems more stable. The PR estimator of MSE does well: it is very stable and shows good bias properties except in the presence of area level outliers, when it is biased downwards significantly. The bias properties of the bootstrap MSE estimator for REBLUP and REBLUP-BC are comparable with CCST, but it is much more stable.
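The four contamination scenarios of this section can be sketched in code. This is a hedged illustration rather than the authors' simulation program: the function names are hypothetical, the second argument of each N(., .) is read as a variance (an assumption), and the intercept `beta0`, slope `beta1` and log-scale mean `log_mean` are placeholder values chosen only so that the sketch runs.

```python
import numpy as np

def generate_population(scenario, rng, m=40, N=100, n_clean=36,
                        beta0=100.0, beta1=5.0, log_mean=1.0):
    """Generate one population under a contamination scenario.

    scenario: "[0,0]", "[e,0]", "[0,u]" or "[e,u]".
    beta0, beta1 and log_mean are placeholders (assumptions, not source values).
    """
    area_outliers = scenario in ("[0,u]", "[e,u]")
    unit_outliers = scenario in ("[e,0]", "[e,u]")

    # Auxiliary variable: lognormal with standard deviation 0.5 on the log scale.
    x = rng.lognormal(mean=log_mean, sigma=0.5, size=(m, N))

    # Area effects: N(0, 3); the last m - n_clean areas are N(9, 20) if contaminated.
    u = rng.normal(0.0, np.sqrt(3.0), size=m)
    if area_outliers:
        u[n_clean:] = rng.normal(9.0, np.sqrt(20.0), size=m - n_clean)

    # Individual effects: N(0, 6), with about 3% drawn from N(0, 150) if contaminated.
    e = rng.normal(0.0, np.sqrt(6.0), size=(m, N))
    if unit_outliers:
        delta = rng.random((m, N)) < 0.97          # Pr(delta = 1) = 0.97
        e_outlier = rng.normal(0.0, np.sqrt(150.0), size=(m, N))
        e = np.where(delta, e, e_outlier)

    y = beta0 + beta1 * x + u[:, None] + e
    return x, y, u

def draw_sample(rng, m=40, N=100, n=5):
    """Simple random sampling without replacement of n units within each area."""
    return np.stack([rng.choice(N, size=n, replace=False) for _ in range(m)])

rng = np.random.default_rng(1)
x, y, u = generate_population("[e,u]", rng)
sample_idx = draw_sample(rng)
```

Repeating the two calls 500 times per scenario, and applying each predictor to the sampled units, would give the Monte Carlo replicates that the summaries in Tables 1 and 2 are computed from.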

6. Design-Based Simulation Study

Design-based simulations complement model-based simulations for small area estimation since they allow us to evaluate the performance of small area estimation methods in the context of a real population and realistic sampling methods where we do not know the precise source of the contamination. From a practical perspective we believe that this type of simulation, by effectively fixing the differences between the small areas, constitutes a more realistic and appropriate representation of the small area estimation problem from a finite population perspective. Further, it provides a good illustration of why a focus on conditional MSE is likely to be closer to the MSE of interest for people using small area methods.

The population underpinning the design-based simulation is based on a data set obtained under the Environmental Monitoring and Assessment Program (EMAP) of the U.S. Environmental Protection Agency. The background to this data set is that between 1991 and 1995, EMAP conducted a survey of lakes in the North-Eastern states of the U.S. The data collected in this survey consist of 551 measurements from a sample of 334 of the 21,026 lakes located in this area. The lakes making up this population are grouped into 113 8-digit Hydrologic Unit Codes (HUCs), of which 64 contained fewer than 5 observations and 27 did not have any. In our simulation, we defined HUCs as the small areas of interest, with lakes grouped within HUCs. The variable of interest is Acid Neutralizing Capacity (ANC), an indicator of the acidification risk of water bodies. A total of 1000 independent random samples of lake locations are then taken from the population of 21,026 lake locations by randomly selecting locations in the 86 HUCs that contain EMAP sampled lakes, with sample sizes in these HUCs set to the greater of five and the original EMAP sample size. Details on the data generation are in Salvati et al. (2008).

Table 3 shows the median relative bias and the median relative root MSE of the different predictors (EBLUP, REBLUP, MQ, REBLUP-BC, MQ-BC).
Similarly, Table 4 reports the median relative bias, the median relative root MSE and the median coverage rate of the corresponding estimators of the MSEs of these predictors calculated from the same sample. The MQ-BC and REBLUP-BC predictors work well in terms of both bias and MSE, while the EBLUP is the worst in terms of relative root MSE. The REBLUP shows a good performance in terms of RRMSE but records a big negative bias. The MQ predictor shows the worst behaviour in terms of bias and MSE.

We now turn to an examination of the performance of the different methods of MSE estimation investigated in the design-based simulation. The Prasad-Rao (PR) estimator of the MSE of the EBLUP has an upward bias and larger instability than the CCST estimator for the EBLUP. This could be due to the unconditional basis of the PR estimator. The CCST estimator seems to offer the best overall results with REBLUP and REBLUP-BC, while CCT and CCST show similar performance in terms of bias and RRMSE for MQ-BC. In this simulation experiment the MSE estimation of the MQ predictor is problematic for both CCT and CCST. The bootstrap MSE estimator does not work for the REBLUP, showing big bias and instability, whereas it is a good competitor for CCT and CCST as far as REBLUP-BC is concerned.

The coverage rates (for nominal 95 percent intervals) are presented in Table 4. The CCST estimation method produces intervals with median coverage close to 95 percent for EBLUP, REBLUP and REBLUP-BC. It records substantial under-coverage for MQ and MQ-BC, even if, for these estimators, it performs better than CCT. The bootstrap MSE estimator shows a degree of over-coverage for REBLUP. This occurs because the bootstrap method assumes that the linear mixed model (1) holds for the small areas, whereas this assumption is difficult to meet in many practical applications. A final comment is appropriate concerning the results on the coverage rate. Chatterjee et al. (2008), discussing the use of bootstrap methods for constructing confidence intervals for small area parameters, argue that there is no guarantee that the asymptotic behaviour underpinning normal theory confidence intervals applies in the context of the small samples that characterize small area estimation. For this reason the authors do not recommend the use of normal theory to construct prediction intervals (as we have done here).
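The summary measures reported in the tables (median relative bias, median RRMSE, and coverage of the "estimate plus or minus twice the estimated root MSE" intervals) follow directly from the definitions given in Section 5. The sketch below is one way to compute them, assuming the simulation output is stored as arrays of shape (simulations, areas); the function and variable names are illustrative, and the tiny input arrays are made-up numbers used only to exercise the code.

```python
import numpy as np

def relative_bias(est, truth):
    """Area-specific relative bias: mean error over simulations / mean true value."""
    return (est - truth).mean(axis=0) / truth.mean(axis=0)

def rrmse(est, truth):
    """Area-specific relative root MSE: root mean squared error / mean true value."""
    return np.sqrt(((est - truth) ** 2).mean(axis=0)) / truth.mean(axis=0)

def coverage(est, truth, mse_est, k=2.0):
    """Share of simulations in which est +/- k * sqrt(mse_est) covers the truth."""
    half_width = k * np.sqrt(mse_est)
    return (np.abs(est - truth) <= half_width).mean(axis=0)

# Two simulations (rows) by two areas (columns); illustrative values only.
est = np.array([[11.0, 19.0], [9.0, 21.0]])
truth = np.array([[10.0, 20.0], [10.0, 20.0]])
mse_est = np.array([[1.0, 0.16], [1.0, 0.16]])

median_rb = np.median(relative_bias(est, truth))
median_rrmse = np.median(rrmse(est, truth))
cover = coverage(est, truth, mse_est)
```

Taking the median across areas, as in the tables, keeps a few badly behaved areas from dominating the summary; the area-level profiles are the place to look when the medians disagree with expectations.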
The behaviour of the empirical true root MSE and its estimators for each area and for each approach is shown in Figures 1, 2 and 3. Examination of these results can be useful for understanding the reasons for the different performances of the MSE estimators. Figure 1 shows the results for the EBLUP predictor, and we can note that the PR estimator does not seem to be able to capture between-area differences in the design-based RMSE of the EBLUP, while the CCT MSE estimator for the EBLUP tracks the irregular profile of the area-specific empirical MSE very well. CCST also works quite well but produces somewhat over-smoothed estimates of area-specific empirical MSE. These results confirm the poor design-based properties of the PR estimator (Longford, 2007).

Figure 2 reports the results for the REBLUP and REBLUP-BC predictors. For the REBLUP (top) it is evident that CCT tends to underestimate the true area-specific MSE, mainly because its squared bias component underestimates the actual squared bias of this predictor. The bootstrap MSE estimator produces over-smoothed estimates of area-specific empirical MSE, because in this simulation the assumption that the linear mixed model (1) holds is violated. The CCST estimator tracks area-specific empirical MSE but shows underestimation in a few areas. It can be seen that the CCST MSE estimator for the REBLUP-BC (bottom) has the best performance and tracks the irregular profile of the area-specific empirical MSE very well, while the bootstrap MSE estimator for the REBLUP-BC generates over-smoothed estimates of area-specific empirical MSE.

Figure 3 illustrates the results for the MQ (top) and MQ-BC (bottom) predictors. The MSE estimators behave similarly. They track the irregular profile of the area-specific empirical MSE very well for MQ-BC, while, for MQ, both CCT and CCST underestimate the true area-specific MSE.

7. Final Remarks

In this paper we explore the extension of the Robust Predictive approach to small area estimation and we propose two different analytical mean squared error (MSE) estimators for outlier robust predictors of small area means. The first proposal is a bias-robust MSE estimator based on the 'pseudo-linearization' approach discussed by Chambers et al. (2007). The second method is a linearization-based MSE estimator based on first order approximations to the variances of solutions of estimating equations.
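As one concrete instance of a linearization-based variance for the solution of a robust estimating equation, the sketch below computes a Huber (1981)-type corrected covariance matrix for a regression M-estimator at q = 0.5. It is a hedged illustration: the formula is the form commonly found in robust regression software and may differ in detail from the estimators of Section 4, the function names are hypothetical, an ordinary least squares fit stands in for the robust fit, and the tuning constant 1.345 is the conventional Huber choice, assumed here for illustration only.

```python
import numpy as np

def huber_psi(r, c):
    """Huber influence function: identity on [-c, c], constant outside."""
    return np.clip(r, -c, c)

def huber_psi_prime(r, c):
    """Derivative of the Huber function: 1 inside [-c, c], 0 outside."""
    return (np.abs(r) <= c).astype(float)

def huber_type_variance(X, resid, scale, c):
    """kappa^2 * [sum(psi^2)/(n-p)] / [mean(psi')]^2 * (X'X)^{-1}, on the data scale.

    Assumes at least one scaled residual lies inside [-c, c], so mean(psi') > 0.
    """
    n, p = X.shape
    r = resid / scale                          # scaled residuals
    psi = huber_psi(r, c)
    dpsi = huber_psi_prime(r, c)
    mean_dpsi = dpsi.mean()
    kappa = 1.0 + (p / n) * dpsi.var(ddof=1) / mean_dpsi ** 2   # bias corrector
    s2 = scale ** 2 * np.sum(psi ** 2) / (n - p)
    return kappa ** 2 * s2 / mean_dpsi ** 2 * np.linalg.inv(X.T @ X)

rng = np.random.default_rng(0)
n = 40
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)   # OLS fit, standing in for an M-fit
resid = y - X @ beta_hat

v_robust = huber_type_variance(X, resid, scale=1.0, c=1.345)
v_limit = huber_type_variance(X, resid, scale=1.0, c=1e6)   # psi is the identity
```

With a very large tuning constant the influence function is effectively unbounded, the corrector equals one, and the result reduces to the classical least squares covariance, mirroring the remark that the EBLUP is a particular case of the REBLUP.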
The empirical results in Sections 5 and 6 show that the bias-corrected predictive estimators (REBLUP-BC and MQ-BC) are less biased than the projective estimators (EBLUP, REBLUP and MQ), especially when there are outliers in the area effects. From the results of the simulation experiments there is evidence that the BC estimators are not as efficient as the projective estimators when outliers are associated with individual effects. This is due to increased variability as a consequence of their bias corrections. We also note that REBLUP-BC and MQ-BC do not fail when there are outliers in the area effects. A method to compute the optimal cut-off value c for the influence function, and so improve the efficiency of the BC estimators, remains to be developed. A cross-validation approach is one possible method.

The pseudo-linearization and linearization-based MSE estimators described in Section 4 and in the Appendix can be an alternative to bootstrap MSE estimation for REBLUP and REBLUP-BC. Moreover, the CCST estimator also shows a good performance for MQ-type estimators. Overall, the CCST method performs reasonably well for the different small area predictors that we have compared in both model-based and design-based simulation experiments. We also note that the Prasad-Rao estimator of the MSE of the EBLUP and the bootstrap MSE estimator of the REBLUP proposed by Sinha and Rao (2009), which work well when their model assumptions are valid, have problems, especially in terms of bias, in the presence of outliers. In the model-based simulations the CCST estimator performs quite well in all scenarios, and it works better than the PR and bootstrap-type MSE estimators in terms of bias, stability and coverage rate when there are outliers in the area and individual effects.

Recently, the CCT estimator has been extended to estimating the MSE of M-quantile Geographically Weighted Regression small area estimators (Salvati et al., 2008) and to predictors based on nonparametric small area models (Salvati et al., 2009). It could be interesting to explore whether the CCST estimator can also be used in these cases, or with nonparametric M-quantile small area estimators (Pratesi et al., 2008). Finally, the CCST MSE estimator presented in this paper is developed under the conditional version of the linear mixed model, i.e. it is conditioned on area effects when applied in the context of a mixed model.
However, it is possible to develop an unconditional version of the CCST MSE estimator that averages over the distribution of the random area effects under a linear mixed model, and so reduces to the Prasad-Rao MSE estimator in the case of the EBLUP. This is an avenue for further research.

Appendix


Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek Dscusson of Extensons of the Gauss-arkov Theorem to the Case of Stochastc Regresson Coeffcents Ed Stanek Introducton Pfeffermann (984 dscusses extensons to the Gauss-arkov Theorem n settngs where regresson

More information

e i is a random error

e i is a random error Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where + β + β e for,..., and are observable varables e s a random error How can an estmaton rule be constructed for the unknown

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

Linear Approximation with Regularization and Moving Least Squares

Linear Approximation with Regularization and Moving Least Squares Lnear Approxmaton wth Regularzaton and Movng Least Squares Igor Grešovn May 007 Revson 4.6 (Revson : March 004). 5 4 3 0.5 3 3.5 4 Contents: Lnear Fttng...4. Weghted Least Squares n Functon Approxmaton...

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

28. SIMPLE LINEAR REGRESSION III

28. SIMPLE LINEAR REGRESSION III 8. SIMPLE LINEAR REGRESSION III Ftted Values and Resduals US Domestc Beers: Calores vs. % Alcohol To each observed x, there corresponds a y-value on the ftted lne, y ˆ = βˆ + βˆ x. The are called ftted

More information

NUMERICAL DIFFERENTIATION

NUMERICAL DIFFERENTIATION NUMERICAL DIFFERENTIATION 1 Introducton Dfferentaton s a method to compute the rate at whch a dependent output y changes wth respect to the change n the ndependent nput x. Ths rate of change s called the

More information

Non-parametric bootstrap mean squared error estimation for M-quantile estimates of small area means, quantiles and poverty indicators *

Non-parametric bootstrap mean squared error estimation for M-quantile estimates of small area means, quantiles and poverty indicators * Non-parametrc bootstrap mean squared error maton for M-quantle mates of small area means quantles and poverty ndcators * Stefano Marchett 1 Monca Prates 2 Nos zavds 3 1 Unversty of Psa e-mal: stefano.marchett@for.unp.t

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems

Chapter 5. Solution of System of Linear Equations. Module No. 6. Solution of Inconsistent and Ill Conditioned Systems Numercal Analyss by Dr. Anta Pal Assstant Professor Department of Mathematcs Natonal Insttute of Technology Durgapur Durgapur-713209 emal: anta.bue@gmal.com 1 . Chapter 5 Soluton of System of Lnear Equatons

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM

ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM ELASTIC WAVE PROPAGATION IN A CONTINUOUS MEDIUM An elastc wave s a deformaton of the body that travels throughout the body n all drectons. We can examne the deformaton over a perod of tme by fxng our look

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

Lecture 12: Discrete Laplacian

Lecture 12: Discrete Laplacian Lecture 12: Dscrete Laplacan Scrbe: Tanye Lu Our goal s to come up wth a dscrete verson of Laplacan operator for trangulated surfaces, so that we can use t n practce to solve related problems We are mostly

More information

Chapter 6. Supplemental Text Material

Chapter 6. Supplemental Text Material Chapter 6. Supplemental Text Materal S6-. actor Effect Estmates are Least Squares Estmates We have gven heurstc or ntutve explanatons of how the estmates of the factor effects are obtaned n the textboo.

More information

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT Malaysan Journal of Mathematcal Scences 8(S): 37-44 (2014) Specal Issue: Internatonal Conference on Mathematcal Scences and Statstcs 2013 (ICMSS2013) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal

More information

Statistics II Final Exam 26/6/18

Statistics II Final Exam 26/6/18 Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the

More information

18. SIMPLE LINEAR REGRESSION III

18. SIMPLE LINEAR REGRESSION III 8. SIMPLE LINEAR REGRESSION III US Domestc Beers: Calores vs. % Alcohol Ftted Values and Resduals To each observed x, there corresponds a y-value on the ftted lne, y ˆ ˆ = α + x. The are called ftted values.

More information

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate

More information

Efficient nonresponse weighting adjustment using estimated response probability

Efficient nonresponse weighting adjustment using estimated response probability Effcent nonresponse weghtng adjustment usng estmated response probablty Jae Kwang Km Department of Appled Statstcs, Yonse Unversty, Seoul, 120-749, KOREA Key Words: Regresson estmator, Propensty score,

More information

Chapter 12 Analysis of Covariance

Chapter 12 Analysis of Covariance Chapter Analyss of Covarance Any scentfc experment s performed to know somethng that s unknown about a group of treatments and to test certan hypothess about the correspondng treatment effect When varablty

More information

2016 Wiley. Study Session 2: Ethical and Professional Standards Application

2016 Wiley. Study Session 2: Ethical and Professional Standards Application 6 Wley Study Sesson : Ethcal and Professonal Standards Applcaton LESSON : CORRECTION ANALYSIS Readng 9: Correlaton and Regresson LOS 9a: Calculate and nterpret a sample covarance and a sample correlaton

More information

β0 + β1xi and want to estimate the unknown

β0 + β1xi and want to estimate the unknown SLR Models Estmaton Those OLS Estmates Estmators (e ante) v. estmates (e post) The Smple Lnear Regresson (SLR) Condtons -4 An Asde: The Populaton Regresson Functon B and B are Lnear Estmators (condtonal

More information

An (almost) unbiased estimator for the S-Gini index

An (almost) unbiased estimator for the S-Gini index An (almost unbased estmator for the S-Gn ndex Thomas Demuynck February 25, 2009 Abstract Ths note provdes an unbased estmator for the absolute S-Gn and an almost unbased estmator for the relatve S-Gn for

More information

A Bound for the Relative Bias of the Design Effect

A Bound for the Relative Bias of the Design Effect A Bound for the Relatve Bas of the Desgn Effect Alberto Padlla Banco de Méxco Abstract Desgn effects are typcally used to compute sample szes or standard errors from complex surveys. In ths paper, we show

More information

The Ordinary Least Squares (OLS) Estimator

The Ordinary Least Squares (OLS) Estimator The Ordnary Least Squares (OLS) Estmator 1 Regresson Analyss Regresson Analyss: a statstcal technque for nvestgatng and modelng the relatonshp between varables. Applcatons: Engneerng, the physcal and chemcal

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS)

Some Comments on Accelerating Convergence of Iterative Sequences Using Direct Inversion of the Iterative Subspace (DIIS) Some Comments on Acceleratng Convergence of Iteratve Sequences Usng Drect Inverson of the Iteratve Subspace (DIIS) C. Davd Sherrll School of Chemstry and Bochemstry Georga Insttute of Technology May 1998

More information

Andreas C. Drichoutis Agriculural University of Athens. Abstract

Andreas C. Drichoutis Agriculural University of Athens. Abstract Heteroskedastcty, the sngle crossng property and ordered response models Andreas C. Drchouts Agrculural Unversty of Athens Panagots Lazards Agrculural Unversty of Athens Rodolfo M. Nayga, Jr. Texas AMUnversty

More information

STAT 511 FINAL EXAM NAME Spring 2001

STAT 511 FINAL EXAM NAME Spring 2001 STAT 5 FINAL EXAM NAME Sprng Instructons: Ths s a closed book exam. No notes or books are allowed. ou may use a calculator but you are not allowed to store notes or formulas n the calculator. Please wrte

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information

Uncertainty as the Overlap of Alternate Conditional Distributions

Uncertainty as the Overlap of Alternate Conditional Distributions Uncertanty as the Overlap of Alternate Condtonal Dstrbutons Olena Babak and Clayton V. Deutsch Centre for Computatonal Geostatstcs Department of Cvl & Envronmental Engneerng Unversty of Alberta An mportant

More information

where I = (n x n) diagonal identity matrix with diagonal elements = 1 and off-diagonal elements = 0; and σ 2 e = variance of (Y X).

where I = (n x n) diagonal identity matrix with diagonal elements = 1 and off-diagonal elements = 0; and σ 2 e = variance of (Y X). 11.4.1 Estmaton of Multple Regresson Coeffcents In multple lnear regresson, we essentally solve n equatons for the p unnown parameters. hus n must e equal to or greater than p and n practce n should e

More information

Supporting Information

Supporting Information Supportng Informaton The neural network f n Eq. 1 s gven by: f x l = ReLU W atom x l + b atom, 2 where ReLU s the element-wse rectfed lnear unt, 21.e., ReLUx = max0, x, W atom R d d s the weght matrx to

More information

F statistic = s2 1 s 2 ( F for Fisher )

F statistic = s2 1 s 2 ( F for Fisher ) Stat 4 ANOVA Analyss of Varance /6/04 Comparng Two varances: F dstrbuton Typcal Data Sets One way analyss of varance : example Notaton for one way ANOVA Comparng Two varances: F dstrbuton We saw that the

More information

Small Area Interval Estimation

Small Area Interval Estimation .. Small Area Interval Estmaton Partha Lahr Jont Program n Survey Methodology Unversty of Maryland, College Park (Based on jont work wth Masayo Yoshmor, Former JPSM Vstng PhD Student and Research Fellow

More information

A Comparative Study for Estimation Parameters in Panel Data Model

A Comparative Study for Estimation Parameters in Panel Data Model A Comparatve Study for Estmaton Parameters n Panel Data Model Ahmed H. Youssef and Mohamed R. Abonazel hs paper examnes the panel data models when the regresson coeffcents are fxed random and mxed and

More information

Testing for seasonal unit roots in heterogeneous panels

Testing for seasonal unit roots in heterogeneous panels Testng for seasonal unt roots n heterogeneous panels Jesus Otero * Facultad de Economía Unversdad del Rosaro, Colomba Jeremy Smth Department of Economcs Unversty of arwck Monca Gulett Aston Busness School

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law:

Introduction to Vapor/Liquid Equilibrium, part 2. Raoult s Law: CE304, Sprng 2004 Lecture 4 Introducton to Vapor/Lqud Equlbrum, part 2 Raoult s Law: The smplest model that allows us do VLE calculatons s obtaned when we assume that the vapor phase s an deal gas, and

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

Time-Varying Systems and Computations Lecture 6

Time-Varying Systems and Computations Lecture 6 Tme-Varyng Systems and Computatons Lecture 6 Klaus Depold 14. Januar 2014 The Kalman Flter The Kalman estmaton flter attempts to estmate the actual state of an unknown dscrete dynamcal system, gven nosy

More information

A New Method for Estimating Overdispersion. David Fletcher and Peter Green Department of Mathematics and Statistics

A New Method for Estimating Overdispersion. David Fletcher and Peter Green Department of Mathematics and Statistics A New Method for Estmatng Overdsperson Davd Fletcher and Peter Green Department of Mathematcs and Statstcs Byron Morgan Insttute of Mathematcs, Statstcs and Actuaral Scence Unversty of Kent, England Overvew

More information

Model Based Direct Estimation of Small Area Distributions

Model Based Direct Estimation of Small Area Distributions Unversty of Wollongong Research Onlne Centre for Statstcal & Survey Methodology Workng Paper Seres Faculty of Engneerng and Informaton Scences 2010 Model Based Drect Estmaton of Small Area Dstrbutons Ncola

More information

DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes

DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR. Introductory Econometrics 1 hour 30 minutes 25/6 Canddates Only January Examnatons 26 Student Number: Desk Number:...... DO NOT OPEN THE QUESTION PAPER UNTIL INSTRUCTED TO DO SO BY THE CHIEF INVIGILATOR Department Module Code Module Ttle Exam Duraton

More information

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling Open Journal of Statstcs, 0,, 300-304 ttp://dx.do.org/0.436/ojs.0.3036 Publsed Onlne July 0 (ttp://www.scrp.org/journal/ojs) Multvarate Rato Estmator of te Populaton Total under Stratfed Random Samplng

More information

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting.

x yi In chapter 14, we want to perform inference (i.e. calculate confidence intervals and perform tests of significance) in this setting. The Practce of Statstcs, nd ed. Chapter 14 Inference for Regresson Introducton In chapter 3 we used a least-squares regresson lne (LSRL) to represent a lnear relatonshp etween two quanttatve explanator

More information

Statistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics )

Statistical Inference. 2.3 Summary Statistics Measures of Center and Spread. parameters ( population characteristics ) Ismor Fscher, 8//008 Stat 54 / -8.3 Summary Statstcs Measures of Center and Spread Dstrbuton of dscrete contnuous POPULATION Random Varable, numercal True center =??? True spread =???? parameters ( populaton

More information

Inductance Calculation for Conductors of Arbitrary Shape

Inductance Calculation for Conductors of Arbitrary Shape CRYO/02/028 Aprl 5, 2002 Inductance Calculaton for Conductors of Arbtrary Shape L. Bottura Dstrbuton: Internal Summary In ths note we descrbe a method for the numercal calculaton of nductances among conductors

More information

USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE

USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE STATISTICA, anno LXXV, n. 4, 015 USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE Manoj K. Chaudhary 1 Department of Statstcs, Banaras Hndu Unversty, Varanas,

More information

Convergence of random processes

Convergence of random processes DS-GA 12 Lecture notes 6 Fall 216 Convergence of random processes 1 Introducton In these notes we study convergence of dscrete random processes. Ths allows to characterze phenomena such as the law of large

More information

On mutual information estimation for mixed-pair random variables

On mutual information estimation for mixed-pair random variables On mutual nformaton estmaton for mxed-par random varables November 3, 218 Aleksandr Beknazaryan, Xn Dang and Haln Sang 1 Department of Mathematcs, The Unversty of Msssspp, Unversty, MS 38677, USA. E-mal:

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

Small area estimation for semicontinuous data

Small area estimation for semicontinuous data Unversty of Wollongong Research Onlne Faculty of Engneerng and Informaton Scences - Papers: Part A Faculty of Engneerng and Informaton Scences 2016 Small area estmaton for semcontnuous data Hukum Chandra

More information

Lecture 2: Prelude to the big shrink

Lecture 2: Prelude to the big shrink Lecture 2: Prelude to the bg shrnk Last tme A slght detour wth vsualzaton tools (hey, t was the frst day... why not start out wth somethng pretty to look at?) Then, we consdered a smple 120a-style regresson

More information

Econ107 Applied Econometrics Topic 9: Heteroskedasticity (Studenmund, Chapter 10)

Econ107 Applied Econometrics Topic 9: Heteroskedasticity (Studenmund, Chapter 10) I. Defnton and Problems Econ7 Appled Econometrcs Topc 9: Heteroskedastcty (Studenmund, Chapter ) We now relax another classcal assumpton. Ths s a problem that arses often wth cross sectons of ndvduals,

More information