Model Based Direct Estimation of Small Area Distributions


University of Wollongong Research Online
Centre for Statistical & Survey Methodology Working Paper Series, Faculty of Engineering and Information Sciences, 2010

Model Based Direct Estimation of Small Area Distributions

Nicola Salvati, University of Pisa, Italy
Hukum Chandra, University of Wollongong
Ray Chambers, University of Wollongong

Recommended Citation: Salvati, Nicola; Chandra, Hukum; and Chambers, Ray, Model Based Direct Estimation of Small Area Distributions, Centre for Statistical and Survey Methodology, University of Wollongong, Working Paper 20-10, 2010, 28p.

Research Online is the open access institutional repository for the University of Wollongong. For further information contact the UOW Library: research-pubs@uow.edu.au

Centre for Statistical and Survey Methodology, The University of Wollongong. Working Paper: Model Based Direct Estimation of Small Area Distributions. Nicola Salvati, Hukum Chandra and Ray Chambers.

Copyright 2008 by the Centre for Statistical & Survey Methodology, UOW. Work in progress, no part of this paper may be reproduced without permission from the Centre.

Centre for Statistical & Survey Methodology, University of Wollongong, Wollongong NSW. Phone , Fax . Email: anca@uow.edu.au

Model Based Direct Estimation of Small Area Distributions

Nicola Salvati (1), Hukum Chandra (2) and Ray Chambers (3)

(1) Dipartimento di Statistica e Matematica Applicata all'Economia, University of Pisa, Italy. E-mail: salvati@ec.unipi.it
(2) Centre for Statistical and Survey Methodology, University of Wollongong, Wollongong, Australia. E-mail: hchandra@uow.edu.au
(3) Centre for Statistical and Survey Methodology, University of Wollongong, Wollongong, Australia. E-mail: ray@uow.edu.au

Summary

Much of the small area estimation literature focuses on population totals and means. However, users of survey data are often interested in the finite population distribution of a survey variable, and in the measures (e.g. medians, quartiles, percentiles) that characterise the shape of this distribution at small area level. In this paper we propose a model-based direct estimator (MBDE, see Chandra and Chambers, 2009) of the small area distribution function. The MBDE is defined as a weighted sum of sample data from the area of interest, with weights derived from the calibrated spline-based estimate of the finite population distribution function introduced by Harms and Duchesne (2006), under an appropriately specified regression model with random area effects. We also discuss mean squared error estimation for the MBDE. Monte Carlo simulations based on both simulated and real datasets show that the proposed MBDE and its associated mean squared error estimator perform well when compared with alternative estimators of the area-specific finite population distribution function.

Key words: Indicator function; Model-based direct estimator; Mean squared error estimator; Simulation experiments.

1. Introduction

Let $U = \{1, 2, \ldots, N\}$ be the finite population of size $N$ and let $y$ denote a variable of interest that takes values over this population. A common target of inference is then the proportion of values $y_j$ that are bounded by a given constant (e.g. the proportion of households whose monthly per capita expenditure is below the poverty line). More generally, the target of inference is the value of the finite population distribution function for a variable $y$ at a specified value $t$. This is

$$F_N(t) = N^{-1} \sum_{j=1}^{N} I(y_j \le t),$$

i.e. the proportion of the population whose values for $y$ are less than or equal to $t$, where $I(y_j \le t)$ is the indicator function that takes the value 1 if $y_j \le t$ and 0 otherwise, and $t$ is a specified constant. Clearly, once we obtain an estimator of the finite population distribution function, we can evaluate its inverse to obtain the associated estimator of the finite population quantile function. See Chambers and Dunstan (1986), Rao et al. (1990), Harms and Duchesne (2006) and Rueda et al. (2007, 2010).

Small area estimation (SAE) is an important objective of many surveys. Small areas or small domains are subsets of the population with small sample sizes, so standard survey estimation methods for these areas, which only use information from the small area samples, are unreliable. In this context SAE methods that borrow strength via statistical models (Rao, 2003) can be used to produce reliable small area estimates. However, virtually all of these methods focus on estimation of linear parameters, e.g. small area means or totals. In this paper we focus on estimation of the small area distribution of a study variable and of the measures (e.g. medians, quartiles, percentiles) that characterise the shape of this distribution. This is especially useful if there are extreme values in the small area sample data, or if the small area distribution of the variable of interest is highly skewed (Tzavidis et al., 2010).
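As a small illustration of the finite population distribution function defined above and of its inversion to a quantile, the following Python sketch may help. The function names and the toy population are assumptions made for this example only and do not come from the paper; the quantile is taken here to be the smallest population value $t$ with $F_N(t) \ge \alpha$, which is one common convention among several.

```python
import numpy as np

def finite_population_df(y, t):
    """F_N(t): proportion of population values y_j that are <= t."""
    y = np.asarray(y, dtype=float)
    return float(np.mean(y <= t))

def finite_population_quantile(y, alpha):
    """Smallest population value t such that F_N(t) >= alpha."""
    y_sorted = np.sort(np.asarray(y, dtype=float))
    # Position of the first order statistic whose cumulative share reaches alpha
    k = int(np.ceil(alpha * y_sorted.size)) - 1
    return float(y_sorted[max(k, 0)])

# Toy population (illustrative values only)
y_pop = np.array([3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0])
F_at_4 = finite_population_df(y_pop, 4.0)        # 5 of the 8 values are <= 4
median = finite_population_quantile(y_pop, 0.5)  # inverse of F_N at 0.5
```

Any estimator of $F_N(t)$ that is monotone in $t$ can be inverted in the same way to give an estimator of the corresponding quantile.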

We propose a model-based direct estimator (MBDE) for the small area distribution function, extending the MBDE approach of Chandra and Chambers (2009) to this setting. This MBDE is a weighted sum of the sample data from the small area of interest, with weights that are derived from a spline-based calibrated estimator of the population distribution function (Harms and Duchesne, 2006) under a regression model with random area effects.

The rest of the article is organized as follows. The following Section describes SAE based on the linear mixed model and on the nonparametric regression model based on penalized splines, and then uses these models to motivate estimators of the small area distribution function. Section 3 introduces the concept of calibrated sample weights for a finite population distribution function and uses these to define the MBDE for this function. A bias-robust estimator of the mean squared error of the MBDE is also developed, based on the approach of Chambers et al. (2009). The empirical performances of the proposed MBDE and of alternative estimators of the small area distribution function are evaluated in Section 4, using both model-based and design-based simulations, with the design-based simulations based on two real data sets. Concluding remarks are set out in Section 5.

2. Estimation of the Small Area Distribution Function

We assume that a finite population $U$ containing $N$ units can be partitioned into $A$ non-overlapping domains, referred to from now on as small areas, or simply areas, indexed by $i = 1, \ldots, A$, with area $i$ containing $N_i$ units, so $N = \sum_{i=1}^{A} N_i$. Let $y_{ij}$ denote the value of the variable of interest $y$ for unit $j$ ($j = 1, \ldots, N_i$) in area $i$ ($i = 1, \ldots, A$). The area-specific distribution function of $y$ for area $i$ is

$$F_i(t) = N_i^{-1} \sum_{j=1}^{N_i} I(y_{ij} \le t). \qquad (1)$$

Let $s$ denote a sample of $n$ units drawn from $U$ by some specified sampling design, and assume that values of the variable of interest $y$ are available for each of these $n$ sample units. The non-sample component of $U$, containing $N - n$ units, is denoted by $r$. In what follows, we use a subscript of $i$ to denote quantities specific to area $i$ ($i = 1, \ldots, A$). For example, $s_i$ and $r_i$ denote the $n_i$ sample and $N_i - n_i$ non-sample units respectively for area $i$. With this notation, the conventional estimators of the area $i$ distribution function, $F_i(t)$, are the Horvitz-Thompson (HT) estimator

$$\hat F_i^{HT}(t) = N_i^{-1} \sum_{j \in s_i} \pi_{ij}^{-1} I(y_{ij} \le t), \qquad (2)$$

and the Hajek estimator

$$\hat F_i^{Hajek}(t) = \frac{\sum_{j \in s_i} \pi_{ij}^{-1} I(y_{ij} \le t)}{\sum_{j \in s_i} \pi_{ij}^{-1}}. \qquad (3)$$

Here $\pi_{ij}$ denotes the sample inclusion probability of unit $j$ in area $i$. Both (2) and (3) are area-specific design-based direct estimators and do not depend on an assumed model for their validity (Cochran, 1977). Unfortunately, empirical evidence presented in Rueda et al. (2007) shows that these estimators can be substantially biased, while the fact that they only use information from the area sample makes them too unstable for SAE.

Model-based small area estimators based on the linear mixed model are widely used in SAE. However, if the functional form of the regression relationship between the variable of interest and the available auxiliary variables is unknown or complicated, then SAE based on a nonparametric regression model can offer significant advantages compared with SAE based on a linear model. In particular, a nonparametric regression model based on p-splines is attractive because it represents a relatively straightforward extension of a linear regression model (Eilers and Marx, 1996). Opsomer et al. (2008) describe the use of a spline-based nonparametric regression model for SAE. See also Salvati et al. (2010). In the rest of this Section we therefore summarize the model-based approach to estimation of the small area distribution function under the linear mixed model and under a nonparametric regression model.

2.1 Estimation under the linear mixed model

SAE theory for this case is now well established, see Rao (2003). We briefly describe it below since this allows us to introduce notation that will be used elsewhere in the paper. To start, we note that throughout this paper we assume that we have access to the population values of $p$ auxiliary scalar variables that are, to a greater or lesser extent, correlated with $y$. Let $x_{ij}$ denote the vector of values of these auxiliary variables that are associated with $y_{ij}$ and let $z_{ij}$ denote a vector of auxiliary contextual variables whose values are known for all units in the population. Let $y_U$, $X_U$ and $Z_U$ denote the population level vector and matrices defined by $y_{ij}$, $x_{ij}$ and $z_{ij}$, respectively. Then the linear mixed model is

$$y_U = X_U \beta + Z_U u + e_U, \qquad (4)$$

where $\beta$ is a $p$-vector of regression coefficients, $u$ is a random vector of area effects and $e_U$ is a population $N$-vector of random individual effects. In general, area effects are vector-valued, so $u^T = (u_1^T, u_2^T, \ldots, u_A^T)$ and $Z_U = \mathrm{diag}\{Z_i;\ i = 1, \ldots, A\}$, where $i$ indexes the $A$ areas that make up the population and $Z_i$ is of dimension $N_i \times q$. The area effects $\{u_i;\ i = 1, \ldots, A\}$ are assumed to be independent and identically distributed realisations of a random vector of dimension $q$ with zero mean and covariance matrix $\Sigma_u$. Similarly, the scalar individual effects making up $e_U$ are assumed to be independent and identically distributed realisations of a random variable with zero mean and variance $\sigma_e^2$, with area and individual effects mutually independent. The covariance matrix of the vector $y_U$ is then

$$\mathrm{Var}(y_U) = V_U = Z_U \Sigma_u Z_U^T + \sigma_e^2 I_N,$$

where $I_k$ denotes the identity matrix of dimension $k$. The parameters $\theta = (\Sigma_u, \sigma_e^2)$ are typically referred to as the variance components of (4).
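The design-based estimators (2) and (3) introduced earlier in this Section can be sketched in Python as follows. This is a minimal illustration only: the function names, the sample values and the inclusion probabilities are invented for the example and are not taken from the paper.

```python
import numpy as np

def ht_df(y_s, pi_s, N_i, t):
    """Horvitz-Thompson estimator (2): N_i^{-1} * sum_j I(y_j <= t) / pi_j."""
    y_s = np.asarray(y_s, dtype=float)
    pi_s = np.asarray(pi_s, dtype=float)
    return float(np.sum((y_s <= t) / pi_s) / N_i)

def hajek_df(y_s, pi_s, t):
    """Hajek estimator (3): weighted mean of indicators with weights 1 / pi_j."""
    y_s = np.asarray(y_s, dtype=float)
    w = 1.0 / np.asarray(pi_s, dtype=float)
    return float(np.sum(w * (y_s <= t)) / np.sum(w))

# Hypothetical sample from one small area with N_i = 20 population units
y_s = np.array([2.0, 5.0, 7.0, 10.0])
pi_s = np.array([0.1, 0.2, 0.2, 0.5])
N_i = 20

ht_6 = ht_df(y_s, pi_s, N_i, 6.0)
hajek_6 = hajek_df(y_s, pi_s, 6.0)
ht_max = ht_df(y_s, pi_s, N_i, 10.0)    # the design weights sum to 22, not 20
hajek_max = hajek_df(y_s, pi_s, 10.0)
```

In this toy sample the design weights sum to 22 rather than to $N_i = 20$, so the HT estimate at the largest sample value exceeds one, whereas the Hajek estimate is a proper distribution function by construction. Both use only the handful of observations from the area itself, which is the source of the instability noted above.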

We also assume throughout this paper that the method of sampling is non-informative given the auxiliary variables, so the model (4) holds for both sampled and non-sampled population units. Consequently, we can partition $y_U$, $X_U$, $Z_U$ and $e_U$ into components defined by the $n$ sampled and $N - n$ non-sampled population units, denoted by subscripts of $s$ and $r$ respectively, and re-express (4) as follows:

$$y_U = \begin{bmatrix} y_s \\ y_r \end{bmatrix} = \begin{bmatrix} X_s \\ X_r \end{bmatrix} \beta + \begin{bmatrix} Z_s \\ Z_r \end{bmatrix} u + \begin{bmatrix} e_s \\ e_r \end{bmatrix},$$

with the variance of $y_U$ similarly partitioned,

$$V_U = \begin{bmatrix} V_{ss} & V_{sr} \\ V_{rs} & V_{rr} \end{bmatrix}.$$

Thus $X_s$ represents the matrix defined by the $n$ sample values of the auxiliary variable vector, while

$$V_{ss} = \mathrm{diag}\{V_{iss};\ i = 1, \ldots, A\} = \mathrm{diag}\{Z_{is} \Sigma_u Z_{is}^T + \sigma_e^2 I_{n_i};\ i = 1, \ldots, A\}$$

and

$$V_{sr} = \mathrm{diag}\{V_{isr};\ i = 1, \ldots, A\} = \mathrm{diag}\{Z_{is} \Sigma_u Z_{ir}^T;\ i = 1, \ldots, A\}.$$

Here $Z_{is}$ and $Z_{ir}$ respectively denote the restriction of $Z_i$ to sampled and non-sampled units in area $i$. The distribution function for small area $i$ given by (1) can be expressed as

$$F_i(t) = N_i^{-1} \Big\{ \sum_{j \in s_i} I(y_{ij} \le t) + \sum_{j \in r_i} I(y_{ij} \le t) \Big\},$$

where the first sum is known and the second is unknown. The problem of estimating $F_i(t)$ therefore reduces to predicting the values $y_{ij}$ for the non-sample units in area $i$. Given estimated values $\hat\theta = (\hat\Sigma_u, \hat\sigma_e^2)$ of the variance components we can define the estimated covariance matrix $\hat V_U$, and the predicted values of $y_{ij}$ are

$$\hat y_{ij}^{EBLUP} = x_{ij}^T \hat\beta^{EBLUE} + z_{ij}^T \hat u^{EBLUP},$$

where $\hat\beta^{EBLUE} = (X_s^T \hat V_{ss}^{-1} X_s)^{-1} X_s^T \hat V_{ss}^{-1} y_s$ is the empirical best linear unbiased estimator (EBLUE) of $\beta$ and $\hat u^{EBLUP} = \hat\Sigma_u Z_s^T \hat V_{ss}^{-1} (y_s - X_s \hat\beta^{EBLUE})$ is the empirical best linear unbiased predictor (EBLUP) of $u$. Substituting estimated values for the parameters of (4) then allows us to define an estimator for $F_i(t)$ of the form

$$\hat F_i^{EBP}(t) = N_i^{-1} \Big\{ \sum_{j \in s_i} I(y_{ij} \le t) + \sum_{j \in r_i} I(\hat y_{ij}^{EBLUP} \le t) \Big\}. \qquad (5)$$

We refer to (5) as the empirical best predictor or EBP. An alternative way of predicting $F_i(t)$ is via the Chambers and Dunstan (hereafter CD) estimator. See Chambers and Dunstan (1986) for details. Since the within area residuals are homoskedastic under (4), the CD estimator of $F_i(t)$ can be written

$$\hat F_i^{CD}(t) = N_i^{-1} \Big\{ \sum_{j \in s_i} I(y_{ij} \le t) + n_i^{-1} \sum_{j \in r_i} \sum_{k \in s_i} I\big(\hat y_{ij}^{EBLUP} + (y_{ik} - \hat y_{ik}^{EBLUP}) \le t\big) \Big\}. \qquad (6)$$

Note that the CD estimator is asymptotically unbiased if (4) is correctly specified.

2.2 Estimation under a nonparametric mixed model

The CD estimator (6) will be biased if the functional form of the relationship between the response variable and the auxiliary variables (i.e. the regression function) is not linear or the variance term in the regression model is misspecified (Tzavidis et al., 2010). This susceptibility of parametric model-based methods to misspecification bias provides motivation for the use of alternative nonparametric model-based methods. We now summarize application of the p-spline nonparametric regression model to SAE (Opsomer et al., 2008), and, for simplicity, consider the univariate case. The underlying regression model is then $y_j = m(x_j) + e_j$, where the $e_j$ are independent random variables with zero means. The function $m(x)$ is unknown and assumed to be approximated sufficiently well by

$$m(x; \beta, \gamma) = \beta_0 + \beta_1 x + \cdots + \beta_b x^b + \sum_{k=1}^{K} \gamma_k (x - \kappa_k)_+^b, \qquad (7)$$

where $b$ is the degree of the spline, $(c)_+^b = c^b I(c > 0)$, $\kappa_k$, $k = 1, \ldots, K$, is a set of fixed constants called knots, $\beta = (\beta_0, \ldots, \beta_b)^T$ is the coefficient vector of the parametric part of the model and $\gamma = (\gamma_1, \ldots, \gamma_K)^T$ is the vector of spline coefficients. The approximating function $m(x; \beta, \gamma)$ in (7) uses truncated polynomial basis functions for simplicity and, if the number of knots $K$ is sufficiently large, can approximate most smooth functions. Ruppert et al. (2003, Chapter 5) suggest the use of a knot for every four observations, up to a maximum of about 40 knots for a univariate application.

Using a large number of knots in (7) can lead to an unstable fit. In order to overcome this problem, an upper limit is usually imposed on the size of the spline coefficient vector $\gamma$. Estimating $\beta$ and $\gamma$ by minimizing the squared deviations of model (7) from the actual data values subject to this constraint is equivalent to minimizing the penalized loss function

$$\sum_j \big( y_j - m(x_j; \beta, \gamma) \big)^2 + \lambda \gamma^T \gamma. \qquad (8)$$

Here $\lambda$ is a Lagrange multiplier that controls the level of smoothness of the resulting fit. Wand (2003) and Ruppert et al. (2003, Chapter 4) note the equivalence between minimizing (8) and maximizing the likelihood of the response variable under the linear model (7) where the spline coefficients are treated as random effects. In particular, let $y_U = (y_1, y_2, \ldots, y_N)^T$,

$$X_U = \begin{bmatrix} 1 & x_1 & \cdots & x_1^b \\ \vdots & \vdots & & \vdots \\ 1 & x_N & \cdots & x_N^b \end{bmatrix} \quad \text{and} \quad \Delta_U = \begin{bmatrix} (x_1 - \kappa_1)_+^b & \cdots & (x_1 - \kappa_K)_+^b \\ \vdots & & \vdots \\ (x_N - \kappa_1)_+^b & \cdots & (x_N - \kappa_K)_+^b \end{bmatrix}.$$

The spline approximation (7) can then be written as the linear mixed model

$$y_U = X_U \beta + \Delta_U \gamma + e_U, \qquad (9)$$

where $\gamma$ and $e_U$ are now assumed to be independent Gaussian random vectors of dimension $K$ and $N$ respectively. In particular, it is assumed that

$\gamma \sim N(0, \sigma_\gamma^2 I_K)$ and $e_U \sim N(0, \sigma_e^2 I_N)$. Opsomer et al. (2008) adapt p-splines to the SAE context by adding area random effects to (9), which then becomes

$$y_U = X_U \beta + \Delta_U \gamma + Z_U u + e_U, \qquad (10)$$

where, as in Section 2.1, $Z_U = (z_1, \ldots, z_N)^T$ is a matrix of known covariates of dimension $N \times A$ characterising differences among the areas and $u$ is the $A$-vector of random area effects. In the simplest case, $Z_U$ is given by a matrix whose $i$-th column, for $i = 1, \ldots, A$, is an indicator variable that takes the value 1 if a unit is in area $i$ and is zero otherwise. It is assumed that the area effects are distributed independently of the spline effects $\gamma$ and the individual effects $e_U$, with $u \sim N(0, \Sigma_u)$, so that the covariance matrix of the vector $y_U$ is

$$\mathrm{Var}(y_U) = V_U = \sigma_\gamma^2 \Delta_U \Delta_U^T + Z_U \Sigma_u Z_U^T + \sigma_e^2 I_N.$$

The variance components of (10) are then given by $\theta = (\sigma_\gamma^2, \Sigma_u, \sigma_e^2)$. Note that, as in the previous Section, the use of non-informative sampling given the auxiliary variables means that (10) also holds at the sample level.

When the variance components are known, well-established theory (McCulloch and Searle, 2001, Chapter 9) leads to the generalised least squares estimator of $\beta$, i.e.

$$\hat\beta = (X_s^T V_{ss}^{-1} X_s)^{-1} X_s^T V_{ss}^{-1} y_s,$$

and the best linear unbiased predictors (BLUPs) for $\gamma$ and $u$, i.e.

$$\hat\gamma = \sigma_\gamma^2 \Delta_s^T V_{ss}^{-1} (y_s - X_s \hat\beta) \quad \text{and} \quad \hat u = \Sigma_u Z_s^T V_{ss}^{-1} (y_s - X_s \hat\beta).$$

In practice, the variance components are unknown and must be estimated from sample data using methods such as maximum likelihood or restricted maximum likelihood; see Harville (1977). In what follows we use $(\hat\sigma_\gamma^2, \hat\Sigma_u, \hat\sigma_e^2)$ to denote such estimates, allowing us to define the plug-in estimator $\hat V_{ss} = \hat\sigma_\gamma^2 \Delta_s \Delta_s^T + Z_s \hat\Sigma_u Z_s^T + \hat\sigma_e^2 I_n$, where $I_n$ is the identity matrix of order $n$. This leads to the nonparametric model-based EBLUE for $\beta$,

$$\hat\beta^{NPEBLUE} = (X_s^T \hat V_{ss}^{-1} X_s)^{-1} X_s^T \hat V_{ss}^{-1} y_s,$$

and to the corresponding nonparametric EBLUPs (NPEBLUPs) for the spline and area effects in (10),

$$\hat\gamma^{NPEBLUP} = \hat\sigma_\gamma^2 \Delta_s^T \hat V_{ss}^{-1} (y_s - X_s \hat\beta^{NPEBLUE}) \quad \text{and} \quad \hat u^{NPEBLUP} = \hat\Sigma_u Z_s^T \hat V_{ss}^{-1} (y_s - X_s \hat\beta^{NPEBLUE}).$$

Under (10), the nonparametric empirical best predictor of the distribution function for area $i$ (denoted by NPEBP) is

$$\hat F_i^{NPEBP}(t) = N_i^{-1} \Big\{ \sum_{j \in s_i} I(y_{ij} \le t) + \sum_{j \in r_i} I(\hat y_{ij}^{NPEBLUP} \le t) \Big\}, \qquad (11)$$

where $\hat y_{ij}^{NPEBLUP} = x_{ij}^T \hat\beta^{NPEBLUE} + \delta_{ij}^T \hat\gamma^{NPEBLUP} + z_{ij}^T \hat u^{NPEBLUP}$, and $x_{ij}^T$, $\delta_{ij}^T$ and $z_{ij}^T$ denote respectively the rows of $X_U$, $\Delta_U$ and $Z_U$ that correspond to unit $j$ in area $i$. Similarly, under (10), the nonparametric version of the CD estimator of the distribution function for area $i$ is

$$\hat F_i^{NPCD}(t) = N_i^{-1} \Big\{ \sum_{j \in s_i} I(y_{ij} \le t) + n_i^{-1} \sum_{j \in r_i} \sum_{k \in s_i} I\big(\hat y_{ij}^{NPEBLUP} + (y_{ik} - \hat y_{ik}^{NPEBLUP}) \le t\big) \Big\}. \qquad (12)$$

3. The Model-Based Direct Estimator for the Small Area Distribution Function

A direct estimate for a small area is simple to interpret, since the estimated value of the variable of interest for the area is just a weighted average of the sample data from the same area. This is not true of an indirect estimator like the EBLUP, which is a weighted sum over the entire sample. Unfortunately, when weights are the inverses of sample inclusion probabilities, conventional direct estimators like (2) and (3) can be quite inefficient. The Model-Based Direct Estimator (MBDE) of a small area mean improves upon the efficiency of these conventional direct estimators by using the weights that define the EBLUP for the population total under a model with random area effects. See Chandra and Chambers (2009) and Salvati et al. (2010). MBDEs for the population mean of $y$ using weights based on the linear model (4) as well as those based on the nonparametric model (10) are therefore possible. However, the finite population distribution function is the population mean of an indicator variable, which does not satisfy either (4) or (10). Consequently, 'standard' EBLUP weights are not appropriate for defining the MBDE of this function. Instead, we use sample weights that are calibrated to the known finite population distribution of the auxiliary variables in $x$ and are based on a model with random area effects. For simplicity, we restrict our discussion below to a single scalar covariate $x$, noting that the extension to multiple scalar covariates is straightforward.

The calibrated estimator of a finite population distribution function $F_N(t)$ was defined in Harms and Duchesne (2006) as a weighted empirical distribution function

$$\hat F_N^{HD}(t) = N^{-1} \sum_{j \in s} w_j I(y_j \le t), \qquad (13)$$

where the sample weights $w_j$ in (13) are calibrated to the known finite population distribution of $x$. In particular, let $0 < \alpha_1 < \alpha_2 < \cdots < \alpha_K < 1$ denote an ordered set of constants. Then the weights used in (13) sum to $N$ and, for $k = 1, \ldots, K$, also satisfy

$$\sum_{j \in s} w_j I\{x_j \le Q_x(\alpha_k)\} = N \alpha_k, \qquad (14)$$

where $Q_x(\alpha_k)$ is the known $\alpha_k$-quantile of the finite population distribution of $x$. That is, the weights used in (13) are calibrated to both the population size $N$ and to the population totals of the auxiliary variables defined by the indicators $I\{x_j \le Q_x(\alpha_k)\}$. Standard results from calibration theory (Deville and Särndal, 1992; Chambers, 1996) can be used to show that if these calibrated weights $w_j$ are then chosen to minimise their chi-square distance from the weights used in the Horvitz-Thompson estimator (2), as is commonly done, then (13) is a regression estimator of $F_N(t)$ under the linear model

$$I(y_j \le t) = \beta_{0t} + \sum_{k=1}^{K} \beta_{kt} I\{x_j \le Q_x(\alpha_k)\} + \varepsilon_{jt}, \qquad (15)$$

where the $\varepsilon_{jt}$ are uncorrelated errors with zero expectation and variance $\sigma_{\varepsilon t}^2$ (Chambers, 2005). However, (15) is also easily seen to be a p-spline model with knots at the $\alpha_k$-th quantiles of the finite population distribution of $x$. That is, $\hat F_N^{HD}(t)$ is actually a p-spline estimator of $F_N(t)$.

Define $g_{jk} = I\{x_j \le Q_x(\alpha_k)\}$ and let $g_{Uk} = (g_{jk};\ j = 1, \ldots, N)$ be the corresponding population $N$-vector, so $G_U = [1_N, g_{U1}, \ldots, g_{UK}]$ denotes the population level matrix of values of these variables, where $1_N$ denotes an $N$-vector of ones. Also, define $d_{jt} = I(y_j \le t)$ and put $d_{Ut}$ equal to the $N$-vector of population values of the $d_{jt}$. The population level version of model (15) is then

$$d_{Ut} = G_U \beta_t + \varepsilon_{Ut}. \qquad (16)$$

Given the appropriate sample and non-sample components of $d_{Ut}$ and $G_U$, and the covariance matrix $V_{Ut} = \sigma_{\varepsilon t}^2 I_N$ of $\varepsilon_{Ut}$, the vector of sample weights $w_{jt}^{DF}$ that define the EBLUP of the population total of the $d_{jt}$ under (16) is then

$$w_{st}^{DF} = (w_{jt}^{DF};\ j \in s) = 1_n + \hat H_{st}^T (G_U^T 1_N - G_s^T 1_n) + (I_n - \hat H_{st}^T G_s^T) \hat V_{sst}^{-1} \hat V_{srt} 1_{N-n}, \qquad (17)$$

where $\hat H_{st} = (G_s^T \hat V_{sst}^{-1} G_s)^{-1} G_s^T \hat V_{sst}^{-1}$. Under (16), $\hat V_{sst} = \hat\sigma_{\varepsilon t}^2 I_n$ and $\hat V_{srt} = 0$, so these weights simplify to

$$w_s^{DF} = (w_j^{DF};\ j \in s) = 1_n + G_s (G_s^T G_s)^{-1} (G_U^T 1_N - G_s^T 1_n).$$

The model (16) is easily adapted to small area estimation by including random area effects. That is, we replace (16) by

$$d_{Ut} = G_U \beta_t + Z_U u_t + \varepsilon_{Ut}, \qquad (18)$$

where $Z_U$ was defined following (4) and $u_t \sim N(0, \Omega_t)$ is an $A$-vector of random area effects. As usual, we assume that $u_t$ and $\varepsilon_{Ut}$ are independently distributed, so that $\mathrm{Var}(d_{Ut}) = V_{Ut} = Z_U \Omega_t Z_U^T + \sigma_{\varepsilon t}^2 I_N$. The sample weights $w_{jt}^{DF}$ that define the EBLUP of the population total of the $d_{jt}$ under (18) are then still given by (17), but now with

$\hat V_{sst} = Z_s \hat\Omega_t Z_s^T + \hat\sigma_{\varepsilon t}^2 I_n$ and $\hat V_{srt} = Z_s \hat\Omega_t Z_r^T$, where $\hat\Omega_t$ and $\hat\sigma_{\varepsilon t}^2$ are the estimated values of the variance components of (18).

In practice, one first needs to decide on the calibration constraints (14) before (18) can be fitted and (17) calculated. This in turn requires that one has chosen the values $0 < \alpha_1 < \alpha_2 < \cdots < \alpha_K < 1$. We adapt the ordered half-sample cross-validation procedure described in Chambers (2005) for this purpose. In particular, we fix $K = 1$ and then search for the value $\alpha_t^{opt}$ that maximises the concordance between the sample values of $d_{jt}$ and the sample values of $g_j = I\{x_j \le Q_x(\alpha)\}$. The steps in this procedure are as follows:

1. Order the sample x-values: $x_{(1)}, x_{(2)}, x_{(3)}, \ldots, x_{(n-1)}, x_{(n)}$;
2. Create two sets $E = \{x_{(1)}, x_{(3)}, \ldots\}$ and $V = \{x_{(2)}, x_{(4)}, \ldots\}$;
3. For given $\alpha$ and $t$, fit the model (18) and then compute the weights (17), treating $E$ as the 'sample' and $V$ as the 'nonsample'. Denote the corresponding value of (13) based on these weights by $\hat F_N^{HD(n)}(t, \alpha)$;
4. The optimal value $\alpha_t^{opt}$ then satisfies

$$\Big\{ \hat F_N^{HD(n)}(t, \alpha_t^{opt}) - n^{-1} \sum_{j \in s} I(y_j \le t) \Big\}^2 = \min_{0 < \alpha < 1} \Big\{ \hat F_N^{HD(n)}(t, \alpha) - n^{-1} \sum_{j \in s} I(y_j \le t) \Big\}^2.$$

We note that although this procedure only identifies a single 'most concordant' calibration constraint to use in (14), there is nothing to stop it being extended to identification of multiple calibration constraints. However, some care must then be taken to ensure that the resulting values of $Q_x(\alpha)$ are separated sufficiently in the interval spanned by the sample values of the auxiliary $x$. Failure to do this could result in the sample design matrix defined by (18) not being of full rank.

Finally, given the weights (17), we write down the MBDE for the area $i$ distribution function $F_i(t)$ as

$$\hat F_i^{MBDE}(t) = \frac{\sum_{j \in s_i} w_{jt}^{DF} I(y_j \le t)}{\sum_{j \in s_i} w_{jt}^{DF}}. \qquad (19)$$

We refer to (19) as a direct estimator because it is a weighted average of the sample data from the area of interest. However, this does not mean that it can be calculated from these data alone. The weights (17) are a function of the data from the entire sample. That is, they borrow strength from other areas via the model (18). It should also be pointed out that since the weights (17) depend on $t$, there is no guarantee that (19) defines a monotone function of $t$, i.e. one where $t_1 < t_2$ implies $\hat F_i^{MBDE}(t_1) \le \hat F_i^{MBDE}(t_2)$. This issue will usually not be relevant when one wishes to estimate the distribution of interest at points that are well separated, but can be a problem when the aim is to invert (19) as a function of $t$ in order to estimate quantiles. In such a situation we recommend that (19) be first transformed to be monotone in $t$, e.g. using the approach described in He (1997).

3.1 Mean squared error estimation for the MBDE

A bias-robust estimator of the mean squared error (MSE) of the MBDE is described in Chandra and Chambers (2009), see also Chambers et al. (2009), and we use this approach here to define a corresponding MSE estimator for (19). This is the estimator

$$\hat M\{\hat F_i^{MBDE}(t)\} = \hat V_{it} + \hat B_{it}^2, \qquad (20)$$

where $\hat V_{it}$ is a heteroskedasticity-robust estimator of the conditional prediction variance of $\hat F_i^{MBDE}(t)$ (Royall and Cumberland, 1978), $\hat B_{it}$ is an estimator of the corresponding conditional prediction bias, and the conditioning is with respect to the value of the area effect. In particular, we use

$$\hat V_{it} = N_i^{-2} \sum_{j \in s_i} \Big\{ \big( N_i w_{jt}^{DF(i)} - 1 \big)^2 + (N_i - n_i) n_i^{-1} \Big\} (d_{jt} - \hat\mu_{jt})^2, \qquad (21)$$

where $w_{jt}^{DF(i)} = w_{jt}^{DF} \big/ \sum_{k \in s_i} w_{kt}^{DF}$ and $\hat\mu_{jt}$ is an unbiased linear estimator of the conditional expected value $\mu_{jt} = E(d_{jt} \mid g_j, u_t)$. Chambers et al. (2009) recommend that $\hat\mu_{jt}$ be computed as the unshrunken version of the EBLUP for $\mu_{jt}$, i.e.

$$\hat\mu_{jt} = \hat\beta_{0t} + g_j^T \hat\beta_{1t} + z_j^T (Z_s^T Z_s)^{-1} Z_s^T (I_n - G_s \hat H_{st}) d_{st},$$

where $d_{st}$ denotes the sample component of $d_{Ut}$. For the conditional bias of the MBDE, we use a simple plug-in estimator of the form

$$\hat B_{it} = \sum_{j \in s_i} w_{jt}^{DF(i)} \hat\mu_{jt} - N_i^{-1} \sum_{j \in U_i} \hat\mu_{jt}. \qquad (22)$$

Note that the MSE estimator (20) treats the weights (17) as fixed, i.e. it ignores the extra variability associated with estimation of the variance components, and is therefore a heteroskedasticity-robust first order approximation to the actual conditional MSE of the MBDE. Chambers et al. (2009) refer to this as a pseudo-linearization assumption, since for large overall sample sizes the contribution to the overall MSE of (19) arising from the variability of the estimated variance components will be of smaller order of magnitude than the fixed weights prediction variance estimated by (21). However, the extent of this underestimation will depend on the small area sample sizes and the characteristics of the population of interest, particularly the strength of the small area effects. Finally, we note that the square of (22) is a conservative estimator of the squared bias, since $E(\hat B_{it}^2) = \mathrm{Var}(\hat B_{it}) + \{E(\hat B_{it})\}^2$. However, the extent of this overestimation is typically very small.

4. Empirical Evaluations

In this Section we report the results from model-based and design-based simulation studies that illustrate the performance of the different estimators of the small area distribution function defined in the preceding two Sections. These estimators are set out in Table 1. Their performance in the simulation studies is evaluated by computing for each small area the absolute relative bias (ARB), the relative root mean squared error (RRMSE) and the coverage rate (CR) of nominal 95 per cent confidence intervals, defined as follows:

$$ARB_i = \Big\{ R^{-1} \sum_{r=1}^{R} F_{ir} \Big\}^{-1} \Big| R^{-1} \sum_{r=1}^{R} (\hat F_{ir} - F_{ir}) \Big| \times 100,$$

$$RRMSE_i = \Big\{ R^{-1} \sum_{r=1}^{R} F_{ir} \Big\}^{-1} \sqrt{ R^{-1} \sum_{r=1}^{R} (\hat F_{ir} - F_{ir})^2 } \times 100,$$

and

$$CR_i = R^{-1} \sum_{r=1}^{R} I\big( |\hat F_{ir} - F_{ir}| \le 2 \hat M_{ir}^{1/2} \big) \times 100.$$

Here $R$ denotes the number of simulations, $F_{ir}$ denotes the true value of the area $i$ distribution function at simulation $r$, $\hat F_{ir}$ denotes an estimate of this value, and $\hat M_{ir}$ denotes an estimate of the MSE of $\hat F_{ir}$. The value of the true MSE for $\hat F_{ir}$ is calculated as $R^{-1} \sum_{r=1}^{R} (\hat F_{ir} - F_{ir})^2$. Note that in the design-based simulations $F_{ir} = F_i$.

4.1 Model-based simulations

In the model-based simulations we set $A = 30$ and use two types of models to generate the population values of $y$. The first is a linear model, $y_{ij} = x_{ij} + u_i + e_{ij}$, where $x_{ij} \sim \chi^2(20)$, $j = 1, \ldots, N_i$ and $i = 1, \ldots, A$, with random area effects $u_i$ generated as independent realizations from a $N(0, \ )$ distribution and $e_{ij}$ distributed as $N(0, 94.09)$, corresponding to an intra-area correlation of $\sigma_u^2 / (\sigma_u^2 + \sigma_\varepsilon^2) = 0.2$. Simulations based on this model are referred to as set 1 simulations. The second model is a multiplicative model, $y_{ij} = 5 x_{ij}^{\beta} u_i e_{ij}$, where the values of $x_{ij}$ are independently drawn from the lognormal distribution $\log(x_{ij}) \sim N(6, \sigma_x^2)$, and the individual effects and area effects are independently drawn as $\log(e_{ij}) \sim N(0, \sigma_e^2)$ and $\log(u_i) \sim N(0, \sigma_u^2)$ respectively. We use two sets of parameters for this model, defined by $\beta$ (1 or 2), $\sigma_u$ (0.4 or 0.6), $\sigma_e$ (0.7 or 1.0) and $\sigma_x$ ( or 1.20). These are referred to from now on as set 2a and set 2b. Data values for $y$ generated under set 2a are almost linear in $x$ while those generated under set 2b are quite non-linear in $x$.

The small area population sizes $N_i$ are randomly drawn from a uniform distribution on $[450, 550]$ and kept fixed over the simulations. The small area sample sizes $n_i$ are determined by first selecting a simple random sample of size $n = 600$ from the population and noting the resulting sample sizes in each small area. These area-specific sample sizes $n_i$ are then fixed in the simulations by treating the small areas as strata and carrying out stratified random sampling. A total of $R = 1000$ simulations are then carried out for each combination of model and individual error distribution, with each simulation corresponding to first generating the population values and then drawing a sample. The average ARB values and the average RRMSE values of the different small area distribution function estimators are shown in Tables 2 and 3 respectively. These values are in percentage terms, and the averages are over the 30 small areas. All estimators are evaluated at the 0.1, 0.25, 0.5, 0.75 and 0.9 quantiles of $y$.

4.2 Design-based simulations

The design-based simulations are based on two real survey data sets. The first is based on data collected in the Australian Agricultural Grazing Industry Survey (AAGIS) conducted by the Australian Bureau of Agricultural and Resource Economics. In the original sample there were 759 farms from 12 regions (the small areas of interest), which make up the wheat-sheep zone for Australian broadacre agriculture. We used these sample data to generate a synthetic population of size $N = 39{,}562$ farms by re-sampling the original AAGIS sample of $n = 759$ farms with probability proportional to a farm's sample weight. This fixed population was then repeatedly sampled using stratified random sampling with regions corresponding to strata and with stratum sample sizes the same as in the original sample.
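The performance measures defined at the start of this Section (ARB, RRMSE and coverage rate) can be computed for a single area along the following lines. This Python sketch assumes that the true values, the estimates and the MSE estimates from the $R$ simulations are held in three arrays of equal length; the function name and the toy inputs are illustrative assumptions, not part of the paper.

```python
import numpy as np

def performance_measures(F_true, F_hat, mse_hat):
    """Return (ARB, RRMSE, CR) in per cent for one area over R simulations."""
    F_true = np.asarray(F_true, dtype=float)
    F_hat = np.asarray(F_hat, dtype=float)
    mse_hat = np.asarray(mse_hat, dtype=float)
    err = F_hat - F_true
    mean_true = F_true.mean()
    # Absolute relative bias: |mean error| relative to the mean true value
    arb = abs(err.mean()) / mean_true * 100.0
    # Relative root MSE: root of the simulated true MSE over the mean true value
    rrmse = np.sqrt(np.mean(err ** 2)) / mean_true * 100.0
    # Coverage of the nominal 95 per cent interval F_hat +/- 2 * sqrt(MSE estimate)
    cr = np.mean(np.abs(err) <= 2.0 * np.sqrt(mse_hat)) * 100.0
    return arb, rrmse, cr

# Toy inputs for R = 4 simulations (illustrative values only)
F_true = [0.5, 0.5, 0.5, 0.5]
F_hat = [0.6, 0.4, 0.5, 0.8]
mse_hat = [0.01, 0.01, 0.01, 0.01]
arb, rrmse, cr = performance_measures(F_true, F_hat, mse_hat)
```

Averaging the three returned values over areas gives summaries of the kind reported in the tables.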
The variable of interest is total cash costs (TCC) and the auxiliary variable is land area. Based on the original AAGIS sample data, the fit of the linear mixed model (AIC = ) and the fit of the nonparametric p-spline regression model (AIC = ) were essentially the same, indicating that addition of the nonparametric spline component does not improve the fit of the mixed model. We therefore do not expect to see much difference between the distribution function estimates generated by these two models. The aim is to estimate the values of the regional distribution functions at the 0.1, 0.25, 0.5, 0.75 and 0.9 quantiles of the finite population distribution of TCC.

The data for the second design-based simulation come from the Environmental Monitoring and Assessment Program (EMAP) survey carried out by the Space Time Aquatic Resources Modelling and Analysis Program (STARMAP) at Colorado State University, and we replicate the design-based simulation experiment carried out by Salvati et al. (2010). The background to this data set is that EMAP conducted a survey of lakes in the North-Eastern states of the United States of America between 1991 and . The data collected in this survey included 551 measurements of Acid Neutralizing Capacity (ANC), an indicator of the acidification risk of water bodies in water resource surveys, from a sample of 349 of the 21,028 lakes located in this area. Here we define lakes grouped by 6-digit Hydrologic Unit Code (HUC) as our small areas of interest. Since three HUCs have sample sizes of one, these are combined with adjacent HUCs, leading to a total of 23 small areas. Sample sizes in these 23 areas vary from 2 to 45. A (fixed) pseudo-population of $N = 21{,}028$ lakes is defined by sampling $N$ times with replacement and with probability proportional to a lake's sample weight from the original sample of 349 lakes. A total of $R = 1000$ independent stratified random samples of the same size as the original sample are selected from this pseudo-population, with HUCs corresponding to strata and stratum sample sizes fixed to be the same as in the original sample. The survey variable of interest is the ANC value of a lake, with its elevation defining the auxiliary variable.
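The pseudo-population construction and the repeated stratified sampling used in the design-based simulations can be sketched as below. This is a hedged Python illustration: the area labels, the survey weights and the pseudo-population size are invented placeholders rather than the AAGIS or EMAP data, and `stratified_sample` is a hypothetical helper written for this example.

```python
import numpy as np

rng = np.random.default_rng(2010)

# Hypothetical original sample: area label and survey weight for each of 20 units
area = np.repeat(np.arange(5), [4, 6, 3, 5, 2])
weight = np.linspace(10.0, 100.0, area.size)
N = 1000  # pseudo-population size (illustrative)

# Draw N sample records with replacement, with probability proportional to weight
pop_idx = rng.choice(area.size, size=N, replace=True, p=weight / weight.sum())
pop_area = area[pop_idx]

def stratified_sample(pop_area, n_by_area, rng):
    """Simple random sampling without replacement within each area (stratum)."""
    chosen = []
    for a, n_a in n_by_area.items():
        members = np.flatnonzero(pop_area == a)
        chosen.append(rng.choice(members, size=int(n_a), replace=False))
    return np.concatenate(chosen)

# Stratum sample sizes are fixed at those of the original sample
n_by_area = {int(a): int(np.sum(area == a)) for a in np.unique(area)}
sample = stratified_sample(pop_area, n_by_area, rng)
```

Repeating the last line $R$ times, with the pseudo-population held fixed, gives the independent stratified samples over which the design-based performance measures are averaged.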
Using the original EMAP data, the fit of the linear mixed model (AIC = ) is worse than that of the nonparametric regression model (AIC = ). In this case, therefore, there are gains from including the spline component in the mixed model, and so we expect that estimates of the distribution function based on the nonparametric regression model will perform better than those based on the linear mixed model. Again, the aim is to estimate the values of the individual HUC distribution functions at the 0.1, 0.25, 0.5, 0.75 and 0.9 quantiles of the finite population distribution of ANC.

Tables 4 and 5 show the average over small areas of the ARB and RRMSE values of the different distribution function estimators based on the R = 1000 independent stratified samples taken from the AAGIS and EMAP populations respectively. Similarly, Table 6 shows the corresponding averages over the areas of the true RMSEs and estimated RMSEs, and the actual coverage rates of nominal 95 percent confidence intervals for the true area-specific distribution function values based on the MBDE estimator (19) and its associated MSE estimator (20). Figures 1 and 2 show the area-specific values of the true RMSE and estimated RMSE of the MBDE (19) for the design-based simulations of the AAGIS and EMAP data.

4.3 Discussion

Two things stand out in Tables 2 and 3. The first is that the MBDE offers substantial bias gains over the other DF estimators, at all quantiles, when the relationship between the study variable and the covariate is complicated and/or the usual mixed model distributional assumptions are invalid (sets 2a and 2b). If the underlying population structure is linear and the usual mixed model assumptions hold (set 1), the CD and NPCD estimators have slightly smaller absolute biases than the MBDE. The larger biases of the 'plug-in' EBP and NPEBP estimators are not unexpected in set 1 because these estimators ignore unit level variability in y. Second, the NPCD estimator generally records the lowest RRMSE among the alternatives to the MBDE, but when the relationship between y and x is complicated, as under sets 2a and 2b, the RRMSE values recorded by the MBDE are comparable with, and sometimes lower than, those recorded by the NPCD estimator. On the other hand, under the linear specification (set 1), the MBDE is clearly less efficient than its alternatives.

Design-based simulations serve to complement model-based simulations for SAE, providing evidence of comparative performance and robustness in realistic data scenarios. Table 4 shows the results for the design-based simulations using the AAGIS data. Here we see that the MBDE has lower bias and RMSE than the other predictors at all quantiles. As expected, given the linear relationship between y and x, the CD-based estimators of the DF based on the linear mixed model are generally more efficient than those based on the nonparametric spline regression model. However, the reverse is true for the EBP-based estimators, perhaps reflecting the lower (but still substantial) biases of the NPEBP. Table 5 reports the design-based simulation results for EMAP data. These again indicate that the MBDE dominates the other estimators in terms of bias. The results for RRMSE are not as clear-cut as in the AAGIS simulations, but still show that the performance of the MBDE is comparable with the performance of the NPCD estimator, which was consistently the best of the alternative estimators in terms of RRMSE.

We now turn to an examination of the performance of the MSE estimator (20) for the MBDE. Figures 1 and 2 show that this estimator accurately tracks the simulation (i.e. repeated sampling) area-specific MSEs of the MBDE at all five target quantiles for y. This good performance is confirmed by the results in Table 6, which shows that the area averages of the true RMSEs and the estimated RMSEs obtained using (20) are very close. Finally, we note that one can combine the MBDE estimator (19) with the MSE estimator (20) to generate normal theory confidence intervals for the area-specific value of the distribution function, i.e. as the small area estimate plus or minus twice its corresponding estimated RMSE.
Table 6 shows that the actual coverage rates achieved by these intervals, though generally less than 95 per cent, are still close enough to their target value to be practically useful.
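As a rough illustration of how the quantities summarised in Table 6 could be computed for a single area and quantile, the sketch below evaluates the repeated-sampling RMSE and the coverage of intervals formed as the estimate plus or minus twice its estimated RMSE. The function names and the toy numbers are hypothetical; the replicate estimates and RMSE estimates are assumed to come from (19) and (20), which are not reproduced here.

```python
import numpy as np

def true_rmse(estimates, true_value):
    """Repeated-sampling RMSE of the estimates over the R replicates."""
    errors = np.asarray(estimates, dtype=float) - true_value
    return float(np.sqrt(np.mean(errors ** 2)))

def coverage_rate(estimates, rmse_estimates, true_value, k=2.0):
    """Percentage of replicates whose interval estimate +/- k * estimated RMSE
    contains the true distribution function value (k = 2 as in the text)."""
    estimates = np.asarray(estimates, dtype=float)
    half_width = k * np.asarray(rmse_estimates, dtype=float)
    covered = (estimates - half_width <= true_value) & (true_value <= estimates + half_width)
    return float(100.0 * covered.mean())

# Made-up replicate results for one area at one quantile (illustration only).
toy_estimates = [0.48, 0.52, 0.60, 0.50]
toy_rmse_estimates = [0.02, 0.02, 0.02, 0.02]
toy_true_value = 0.5

area_rmse = true_rmse(toy_estimates, toy_true_value)
area_coverage = coverage_rate(toy_estimates, toy_rmse_estimates, toy_true_value)
```

Averaging `area_rmse`, the mean of the estimated RMSEs, and `area_coverage` over areas would then give summaries of the kind reported per quantile.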

Finally, we note that an alternative to the CD estimator that is both model-consistent and design-consistent has been proposed by Rao et al. (1990). Although the relevant results are not reported here, we also explored the performance of both parametric and nonparametric versions of this estimator in our simulations. In all cases, this performance was almost identical to that of the parametric and nonparametric versions of the CD predictor.

5. Conclusions

This paper develops an MBDE estimator for the value of the area-specific finite population distribution of a response variable y. This estimator is based on sample weights that are calibrated to the finite population distribution of an auxiliary variable x, and also allow for random area effects. We then compare the performance of this MBDE estimator with two competing estimators based on either a linear mixed model or a nonparametric mixed model for y. Our results indicate that the proposed MBDE can sometimes be much better than these alternatives, particularly in realistic applications where fitted models are approximations at best. On the other hand, if the model assumptions are valid (e.g. set 1 in the model-based simulations), then area-specific distribution function estimators based on the CD representation are preferable. We also provide a method for estimating the MSE of the MBDE and demonstrate empirically that it performs well.

References

Chambers, R. (1996). Robust case-weighting for multipurpose establishment surveys. Journal of Official Statistics, 12.

Chambers, R. (2005). Imputation vs. Estimation of Finite Population Distributions. Southampton Statistical Sciences Research Paper. S3RI Methodology Working Papers, M05/06.

Chambers, R., Chandra, H. and Tzavidis, N. (2009). On Bias-Robust Mean Squared Error Estimation for Linear Predictors for Domains. Working Papers, Centre for Statistical and Survey Methodology, The University of Wollongong, Australia. (Available from: )

Chambers, R. and Dunstan, R. (1986). Estimating distribution functions from survey data. Biometrika, 73.

Chandra, H. and Chambers, R. (2009). Multipurpose weighting for small area estimation. Journal of Official Statistics, 25, 3.

Cochran, W.G. (1977). Sampling Techniques, 3rd edition. Wiley & Sons, NY.

Deville, J.C. and Särndal, C.E. (1992). Calibration estimators in survey sampling. Journal of the American Statistical Association, 87.

Eilers, P. and Marx, B. (1996). Flexible Smoothing using B-splines and Penalized Likelihood (with comments and rejoinder). Statistical Science, 11.

Harms, T. and Duchesne, P. (2006). On calibration estimation for quantiles. Survey Methodology, 32.

Harville, D.A. (1977). Maximum likelihood approaches to variance component estimation and to related problems. Journal of the American Statistical Association, 72.

He, X. (1997). Quantile curves without crossing. American Statistician, 51.

McCulloch, C.E. and Searle, S.R. (2001). Generalized Linear and Mixed Models. Wiley, New York.

Opsomer, J.D., Claeskens, G., Ranalli, M.G., Kauermann, G. and Breidt, F.J. (2008). Nonparametric small area estimation using penalized spline regression. Journal of the Royal Statistical Society, Series B, 70.

Rao, J.N.K., Kovar, J.G. and Mantel, H.J. (1990). On estimating distribution functions and quantiles from survey data using auxiliary information. Biometrika, 77.

Rao, J.N.K. (2003). Small Area Estimation. New York: Wiley.

Rueda, M., Martínez, S., Martínez, H. and Arcos, A. (2007). Estimation of the distribution function with calibration methods. Journal of Statistical Planning and Inference, 137.

Rueda, M., Sánchez-Borrego, I., Arcos, A. and Martínez, S. (2010). Model-calibration estimation of the distribution function using nonparametric regression. Metrika, 71.

Ruppert, D., Wand, M.P. and Carroll, R. (2003). Semiparametric Regression. Cambridge University Press, Cambridge.

Royall, R.M. (1976). The linear least-squares prediction approach to two-stage sampling. Journal of the American Statistical Association, 71.

Royall, R.M. and Cumberland, W.G. (1978). Variance estimation in finite population sampling. Journal of the American Statistical Association, 73.

Salvati, N., Chandra, H., Ranalli, M.G. and Chambers, R. (2010). Small area estimation using a nonparametric model-based direct estimator. Computational Statistics and Data Analysis, 54.

Tzavidis, N., Marchetti, S. and Chambers, R. (2010). Robust prediction of small area means and quantiles. Australian and New Zealand Journal of Statistics, 52.

Wand, M.P. (2003). Smoothing and mixed models. Computational Statistics, 18.

Table 1. Description of the estimators considered in the simulation studies.

Estimator   Description
MBDE        MBDE (19) with sample weights (17) based on model (18)
EBP         EBLUP-based EBP estimator (5) under linear mixed model (4)
CD          EBLUP-based CD estimator (6) under linear mixed model (4)
NPEBP       NPEBLUP-based EBP estimator (11) under spline-based mixed model (10)
NPCD        NPEBLUP-based CD estimator (12) under spline-based mixed model (10)

Table 2. Area averages of absolute relative bias (ARB, %) generated by model-based simulations.

Set   Population quantile   MBDE   EBP   CD   NPEBP   NPCD

Table 3. Area averages of relative root mean squared error (RRMSE, %) generated by model-based simulations.

Set   Population quantile   MBDE   EBP   CD   NPEBP   NPCD

Table 4. Average values over 12 regions of absolute relative bias (ARB, %) and relative root mean squared error (RRMSE, %) for the AAGIS data.

Population quantile   MBDE   EBP   CD   NPEBP   NPCD
ARB (%)
RRMSE (%)

Table 5. Average values over 23 HUCs of absolute relative bias (ARB, %) and relative root mean squared error (RRMSE, %) for the EMAP data.

Population quantile   MBDE   EBP   CD   NPEBP   NPCD
ARB (%)
RRMSE (%)

Table 6. Average values of true RMSE and estimated RMSE and actual coverage rate (CR, %) of nominal 95 per cent confidence intervals generated by the MBDE (19) and associated MSE estimator (20) for the AAGIS and EMAP data. Averages are over regions.

                      AAGIS                              EMAP
Population quantile   True RMSE   Estimated RMSE   CR    True RMSE   Estimated RMSE   CR

Figure 1. Region-specific values of actual repeated sampling RMSE (solid line) and average estimated RMSE (dashed line) of MBDE (19) for the AAGIS data.

Figure 2. HUC-specific values of actual repeated sampling RMSE (solid line) and average estimated RMSE (dashed line) of MBDE (19) for the EMAP data.

Small Area Estimation for Business Surveys

Small Area Estimation for Business Surveys ASA Secton on Survey Research Methods Small Area Estmaton for Busness Surveys Hukum Chandra Southampton Statstcal Scences Research Insttute, Unversty of Southampton Hghfeld, Southampton-SO17 1BJ, U.K.

More information

On Outlier Robust Small Area Mean Estimate Based on Prediction of Empirical Distribution Function

On Outlier Robust Small Area Mean Estimate Based on Prediction of Empirical Distribution Function On Outler Robust Small Area Mean Estmate Based on Predcton of Emprcal Dstrbuton Functon Payam Mokhtaran Natonal Insttute of Appled Statstcs Research Australa Unversty of Wollongong Small Area Estmaton

More information

Bias-correction under a semi-parametric model for small area estimation

Bias-correction under a semi-parametric model for small area estimation Bas-correcton under a sem-parametrc model for small area estmaton Laura Dumtrescu, Vctora Unversty of Wellngton jont work wth J. N. K. Rao, Carleton Unversty ICORS 2017 Workshop on Robust Inference for

More information

Small Area Estimation Under Spatial Nonstationarity

Small Area Estimation Under Spatial Nonstationarity Small Area Estmaton Under Spatal Nonstatonarty Hukum Chandra Indan Agrcultural Statstcs Research Insttute, New Delh Ncola Salvat Unversty of Psa Ray Chambers Unversty of Wollongong Nkos Tzavds Unversty

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Chapter 5 Multilevel Models

Chapter 5 Multilevel Models Chapter 5 Multlevel Models 5.1 Cross-sectonal multlevel models 5.1.1 Two-level models 5.1.2 Multple level models 5.1.3 Multple level modelng n other felds 5.2 Longtudnal multlevel models 5.2.1 Two-level

More information

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

Outlier Robust Small Area Estimation

Outlier Robust Small Area Estimation Unversty of Wollongong Research Onlne Centre for Statstcal & Survey Methodology Workng Paper Seres Faculty of Engneerng and Informaton Scences 009 Outler Robust Small Area Estmaton R. Chambers Unversty

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek Dscusson of Extensons of the Gauss-arkov Theorem to the Case of Stochastc Regresson Coeffcents Ed Stanek Introducton Pfeffermann (984 dscusses extensons to the Gauss-arkov Theorem n settngs where regresson

More information

Robust Small Area Estimation Using a Mixture Model

Robust Small Area Estimation Using a Mixture Model Robust Small Area Estmaton Usng a Mxture Model Jule Gershunskaya U.S. Bureau of Labor Statstcs Partha Lahr JPSM, Unversty of Maryland, College Park, USA ISI Meetng, Dubln, August 23, 2011 Parameter of

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling Open Journal of Statstcs, 0,, 300-304 ttp://dx.do.org/0.436/ojs.0.3036 Publsed Onlne July 0 (ttp://www.scrp.org/journal/ojs) Multvarate Rato Estmator of te Populaton Total under Stratfed Random Samplng

More information

Uncertainty as the Overlap of Alternate Conditional Distributions

Uncertainty as the Overlap of Alternate Conditional Distributions Uncertanty as the Overlap of Alternate Condtonal Dstrbutons Olena Babak and Clayton V. Deutsch Centre for Computatonal Geostatstcs Department of Cvl & Envronmental Engneerng Unversty of Alberta An mportant

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

Small area estimation for semicontinuous data

Small area estimation for semicontinuous data Unversty of Wollongong Research Onlne Faculty of Engneerng and Informaton Scences - Papers: Part A Faculty of Engneerng and Informaton Scences 2016 Small area estmaton for semcontnuous data Hukum Chandra

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

Efficient nonresponse weighting adjustment using estimated response probability

Efficient nonresponse weighting adjustment using estimated response probability Effcent nonresponse weghtng adjustment usng estmated response probablty Jae Kwang Km Department of Appled Statstcs, Yonse Unversty, Seoul, 120-749, KOREA Key Words: Regresson estmator, Propensty score,

More information

Statistics II Final Exam 26/6/18

Statistics II Final Exam 26/6/18 Statstcs II Fnal Exam 26/6/18 Academc Year 2017/18 Solutons Exam duraton: 2 h 30 mn 1. (3 ponts) A town hall s conductng a study to determne the amount of leftover food produced by the restaurants n the

More information

Nonparametric model calibration estimation in survey sampling

Nonparametric model calibration estimation in survey sampling Ames February 18, 004 Nonparametrc model calbraton estmaton n survey samplng M. Govanna Ranall Department of Statstcs, Colorado State Unversty (Jont work wth G.E. Montanar, Dpartmento d Scenze Statstche,

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VIII LECTURE - 34 ANALYSIS OF VARIANCE IN RANDOM-EFFECTS MODEL AND MIXED-EFFECTS EFFECTS MODEL Dr Shalabh Department of Mathematcs and Statstcs Indan

More information

x i1 =1 for all i (the constant ).

x i1 =1 for all i (the constant ). Chapter 5 The Multple Regresson Model Consder an economc model where the dependent varable s a functon of K explanatory varables. The economc model has the form: y = f ( x,x,..., ) xk Approxmate ths by

More information

Econ107 Applied Econometrics Topic 9: Heteroskedasticity (Studenmund, Chapter 10)

Econ107 Applied Econometrics Topic 9: Heteroskedasticity (Studenmund, Chapter 10) I. Defnton and Problems Econ7 Appled Econometrcs Topc 9: Heteroskedastcty (Studenmund, Chapter ) We now relax another classcal assumpton. Ths s a problem that arses often wth cross sectons of ndvduals,

More information

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method

Comparison of the Population Variance Estimators. of 2-Parameter Exponential Distribution Based on. Multiple Criteria Decision Making Method Appled Mathematcal Scences, Vol. 7, 0, no. 47, 07-0 HIARI Ltd, www.m-hkar.com Comparson of the Populaton Varance Estmators of -Parameter Exponental Dstrbuton Based on Multple Crtera Decson Makng Method

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 008 Recall: man dea of lnear regresson Lnear regresson can be used to study

More information

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding

Lecture 9: Linear regression: centering, hypothesis testing, multiple covariates, and confounding Recall: man dea of lnear regresson Lecture 9: Lnear regresson: centerng, hypothess testng, multple covarates, and confoundng Sandy Eckel seckel@jhsph.edu 6 May 8 Lnear regresson can be used to study an

More information

Kernel Methods and SVMs Extension

Kernel Methods and SVMs Extension Kernel Methods and SVMs Extenson The purpose of ths document s to revew materal covered n Machne Learnng 1 Supervsed Learnng regardng support vector machnes (SVMs). Ths document also provdes a general

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances Unversty of Wollongong Research Onlne Centre for Statstcal & Survey Methodology Workng Paper Seres Faculty of Engneerng and Informaton Scences 0 A nonparametrc two-sample wald test of equalty of varances

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

Computing MLE Bias Empirically

Computing MLE Bias Empirically Computng MLE Bas Emprcally Kar Wa Lm Australan atonal Unversty January 3, 27 Abstract Ths note studes the bas arses from the MLE estmate of the rate parameter and the mean parameter of an exponental dstrbuton.

More information

A note on regression estimation with unknown population size

A note on regression estimation with unknown population size Statstcs Publcatons Statstcs 6-016 A note on regresson estmaton wth unknown populaton sze Mchael A. Hdroglou Statstcs Canada Jae Kwang Km Iowa State Unversty jkm@astate.edu Chrstan Olver Nambeu Statstcs

More information

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT Malaysan Journal of Mathematcal Scences 8(S): 37-44 (2014) Specal Issue: Internatonal Conference on Mathematcal Scences and Statstcs 2013 (ICMSS2013) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

A Bound for the Relative Bias of the Design Effect

A Bound for the Relative Bias of the Design Effect A Bound for the Relatve Bas of the Desgn Effect Alberto Padlla Banco de Méxco Abstract Desgn effects are typcally used to compute sample szes or standard errors from complex surveys. In ths paper, we show

More information

The Ordinary Least Squares (OLS) Estimator

The Ordinary Least Squares (OLS) Estimator The Ordnary Least Squares (OLS) Estmator 1 Regresson Analyss Regresson Analyss: a statstcal technque for nvestgatng and modelng the relatonshp between varables. Applcatons: Engneerng, the physcal and chemcal

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE

ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE P a g e ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE Darmud O Drscoll ¹, Donald E. Ramrez ² ¹ Head of Department of Mathematcs and Computer Studes

More information

F statistic = s2 1 s 2 ( F for Fisher )

F statistic = s2 1 s 2 ( F for Fisher ) Stat 4 ANOVA Analyss of Varance /6/04 Comparng Two varances: F dstrbuton Typcal Data Sets One way analyss of varance : example Notaton for one way ANOVA Comparng Two varances: F dstrbuton We saw that the

More information

STK4080/9080 Survival and event history analysis

STK4080/9080 Survival and event history analysis SK48/98 Survval and event hstory analyss Lecture 7: Regresson modellng Relatve rsk regresson Regresson models Assume that we have a sample of n ndvduals, and let N (t) count the observed occurrences of

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

University of Wollongong. Research Online

University of Wollongong. Research Online Unversty of Wollongong Research Onlne Centre for Statstcal & Survey Methodology Workng Paper Seres Faculty of Engneerng and Informaton Scences 2009 Borrowng strength over space n small area estmaton: Comparng

More information

Statistics for Managers Using Microsoft Excel/SPSS Chapter 13 The Simple Linear Regression Model and Correlation
