Nonparametric Regression Estimation of Finite Population Totals under Two-Stage Sampling
Ji-Yeon Kim, Iowa State University
F. Jay Breidt, Colorado State University
Jean D. Opsomer, Iowa State University

May 21, 2003

Abstract

A nonparametric regression estimator for the finite population total in two-stage sampling with complete stage-one auxiliary information is developed. The estimator, based on local polynomial regression, is a linear combination of cluster total estimators, with weights that are calibrated to known control totals. The estimator is asymptotically design-unbiased and design consistent under mild assumptions, and its variance can be consistently estimated. Simulation results indicate that the nonparametric estimator dominates several parametric estimators when the model regression function is incorrectly specified, while being nearly as efficient when the parametric specification is
correct. The methodology is illustrated using data from a study of land use and erosion.

Keywords: Auxiliary information, calibration, model-assisted estimation, local polynomial regression, cluster sampling, erosion.

1 Introduction

In many complex surveys, auxiliary information about the population of interest is available. In tracking human disease or monitoring natural resources, for example, geographic information systems may contain location-specific indices for all sites in the study. One approach to using this auxiliary information in estimation is to assume a working model $\xi$ describing the relationship between the study variable of interest and the auxiliary variables. Estimators are then derived on the basis of this model. Estimators are sought that have good efficiency if the model is true, but maintain desirable properties like asymptotic design unbiasedness and design consistency if the model is false.

Often, a linear model is selected as the working model. Generalized regression estimators (e.g., Cassel, Särndal, and Wretman, 1976, 1977; Särndal, 1980; Robinson and Särndal, 1983), including ratio estimators and linear regression estimators (Cochran, 1977), best linear unbiased estimators (Brewer, 1963; Royall, 1970), and poststratification estimators (Holt and Smith, 1979), are all derived from assumed linear models. In some situations, the linear model is not appropriate, and the resulting estimators do not achieve any efficiency gain over purely design-based estimators. Wu and Sitter (2001) propose a class of estimators for which the working models follow a nonlinear parametric shape. The efficient use of nonlinear models, however, requires a priori
knowledge of the specific parametric structure of the population. This is especially problematic if that same model is to be used for many variables of interest, a common occurrence in surveys.

Because of these concerns, some researchers have considered nonparametric models for $\xi$. Dorfman (1992) and Chambers, Dorfman, and Wehrly (1993) developed model-based nonparametric estimators using this approach. Breidt and Opsomer (2000) proposed a new type of model-assisted nonparametric regression estimator for the finite population total, based on local polynomial smoothing. The local polynomial regression estimator has the form of the generalized regression estimator, but is based on a nonparametric superpopulation model applicable to a much larger class of functions.

The theory developed in Breidt and Opsomer (2000) for the local polynomial regression estimator applies only to direct element sampling designs with auxiliary information available for all elements of the population. In many large-scale surveys, however, more complex designs such as multistage or multiphase sampling designs with various types of auxiliary information are commonly used. In this paper, we extend nonparametric regression estimation to two-stage sampling, in which a probability sample of clusters is selected, and then subsamples of elements within each selected cluster are obtained. Such two-stage sampling is frequently used because an adequate frame of elements is not available or would be prohibitively expensive to construct, but a listing of clusters is available. Familiar examples include humans within households, fish within lakes, and trees within plots. In such cases, it is more likely that detailed auxiliary information would be available for the clusters but not the elements. Therefore, we consider local polynomial regression estimation in two-stage element sampling with auxiliary information available for all clusters. Results for single-stage cluster sampling, in which
each sampled cluster is completely enumerated, are obtained as a special case. The case of multi-stage element sampling with auxiliary information available for all primary sampling units is an immediate extension of the results in this paper, but will not be explicitly developed here.

In Section 1.1, we describe our two-stage sampling framework and introduce appropriate notation. In Section 1.2, we adapt the local polynomial regression estimator of Breidt and Opsomer (2000) to two-stage sampling and in Section 1.3 we introduce assumptions used in our theoretical derivations. Design properties of the estimator are described in Section 2. Section 2.1 shows that the estimator is a linear combination of estimators of cluster totals with weights that are calibrated to known control totals. Section 2.2 shows asymptotic design unbiasedness and design consistency of the estimator, approximates the estimator's design mean squared error, and provides a consistent estimator of the design mean squared error. Section 3 describes results of a simulation study, in which the local polynomial regression estimator competes well with a number of other parametric and nonparametric estimators, across a broad range of study variables. In Section 4, we apply the estimator to data from a 1995 study of erosion, using National Resources Inventory (NRI) data as frame materials, and conclude with a brief discussion in Section 5. All proofs are gathered in an appendix.

1.1 Notation

Consider a finite population of elements $U = \{1, \ldots, k, \ldots, N\}$ partitioned into $M$ clusters, $U_1, \ldots, U_i, \ldots, U_M$. The population of clusters is represented as $C = \{1, \ldots, i, \ldots, M\}$. The number of elements in the $i$th cluster $U_i$ is denoted $N_i$. We have $U = \cup_{i \in C} U_i$ and $N = \sum_{i \in C} N_i$. For all clusters $i \in C$, an auxiliary vector $x_i = (x_{1i}, \ldots, x_{Gi})$ is available. For the sake of simplicity we assume that $G = 1$; that is, the $x_i$ are scalars.

At stage one, a probability sample $s$ of clusters is drawn from $C$ according to a fixed size design $p_I(\cdot)$, where $p_I(s)$ is the probability of drawing the sample $s$ from $C$. Let $m$ be the size of $s$. The cluster inclusion probabilities $\pi_i = \Pr\{i \in s\} = \sum_{s:\, i \in s} p_I(s)$ and $\pi_{ij} = \Pr\{i, j \in s\} = \sum_{s:\, i, j \in s} p_I(s)$ are assumed to be strictly positive.

For every sampled cluster $i \in s$, a probability sample $s_i$ of elements is drawn from $U_i$ according to a fixed size design $p_i(\cdot)$ with inclusion probabilities $\pi_{k|i}$ and $\pi_{kl|i}$. That is, $p_i(s_i)$ is the probability of drawing $s_i$ from $U_i$ given that the $i$th cluster is chosen at stage one. The size of $s_i$ is denoted $n_i$. Assume that $\pi_{k|i} = \Pr\{k \in s_i \mid s \ni i\} = \sum_{s_i:\, k \in s_i} p_i(s_i)$ and $\pi_{kl|i} = \Pr\{k, l \in s_i \mid s \ni i\} = \sum_{s_i:\, k, l \in s_i} p_i(s_i)$ are strictly positive.

As is customary for two-stage sampling, we assume invariance and independence of the second-stage design. Invariance of the second-stage design means that for every $i$, and for every $s \ni i$, $p_i(\cdot \mid s) = p_i(\cdot)$. That is, the same within-cluster design is used whenever the $i$th cluster is selected, regardless of what other clusters are selected. Independence of the second-stage design means that subsampling in a given cluster is independent of subsampling in any other cluster.

The whole sample of elements and its size are $\cup_{i \in s} s_i$ and $\sum_{i \in s} n_i$, respectively. The study variable $y_k$ is observed for $k \in \cup_{i \in s} s_i$. The parameter to estimate is the population total $t_y = \sum_{k \in U} y_k = \sum_{i \in C} t_i$, where $t_i = \sum_{k \in U_i} y_k$ is the $i$th cluster total.

Let $I_i = 1$ if $i \in s$ and $I_i = 0$ otherwise. Note that $E_p[I_i] = E_I[E_{II}[I_i]] = E_I[I_i] = \pi_i$, where $E_p[\cdot]$ denotes expectation with respect to the sampling design, $E_I[\cdot]$ denotes expectation with respect to stage one, and $E_{II}[\cdot]$ denotes conditional expectation with respect to stage two given $s$. Also, $V_I(\cdot)$ and $V_{II}(\cdot)$ denote variances with respect to stage one and two, respectively. Using this notation, an estimator $\hat{t}$ of $t$ is said to be design-unbiased if $E_p[\hat{t}] = t$.
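The simple expansion estimator of $t_y$ in two-stage element sampling is given by

$$\hat{t}_y = \sum_{i \in s} \frac{\hat{t}_i}{\pi_i} = \sum_{i \in C} \hat{t}_i \frac{I_i}{\pi_i}, \tag{1}$$

where

$$\hat{t}_i = \sum_{k \in s_i} \frac{y_k}{\pi_{k|i}}$$

is the Horvitz-Thompson (1952) estimator of $t_i$ with respect to the second stage of sampling. We will refer to (1) as the Horvitz-Thompson (HT) estimator. Since $\hat{t}_i$ is design-unbiased for $t_i$, the HT estimator $\hat{t}_y$ is design-unbiased for $t_y$. The variance of the HT estimator $\hat{t}_y$ under the sampling design can be written as the sum of two components,

$$\mathrm{Var}_p(\hat{t}_y) = V_I\left(E_{II}[\hat{t}_y]\right) + E_I\left[V_{II}(\hat{t}_y)\right] = \sum_{i,j \in C} (\pi_{ij} - \pi_i \pi_j) \frac{t_i}{\pi_i} \frac{t_j}{\pi_j} + \sum_{i \in C} \frac{V_i}{\pi_i}, \tag{2}$$

where

$$V_i = V_{II}(\hat{t}_i) = \sum_{k,l \in U_i} (\pi_{kl|i} - \pi_{k|i}\pi_{l|i}) \frac{y_k}{\pi_{k|i}} \frac{y_l}{\pi_{l|i}}$$

is the variance of $\hat{t}_i$ with respect to stage two. A design-unbiased estimator of $V_i$ is given by

$$\hat{V}_i = \sum_{k,l \in s_i} \frac{\pi_{kl|i} - \pi_{k|i}\pi_{l|i}}{\pi_{kl|i}} \frac{y_k}{\pi_{k|i}} \frac{y_l}{\pi_{l|i}}.$$

Note that $V_i$ is non-random due to invariance. Note also that the result for single-stage cluster sampling, in which all elements in each selected cluster are observed, is obtained if we set $\hat{t}_i = t_i$ and $V_i = \hat{V}_i = 0$ for all $i \in C$. See, for example, Särndal, Swensson, and Wretman (1992, Result 4.3.1).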
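As an informal check of the design-unbiasedness claim for the HT estimator (1), the following Python sketch enumerates every possible two-stage sample from a tiny population and averages the estimator over the design. The toy population, the use of simple random sampling at both stages, and the helper name `ht_two_stage` are illustrative assumptions, not part of the paper.

```python
import itertools
import numpy as np

def ht_two_stage(sampled_y, pi_I, pi_within):
    """HT estimator (1): sum over sampled clusters of t_hat_i / pi_i,
    where t_hat_i = sum of y_k / pi_{k|i} over the sampled elements."""
    t_hat = np.array([np.sum(y / p) for y, p in zip(sampled_y, pi_within)])
    return float(np.sum(t_hat / np.asarray(pi_I))), t_hat

# Hypothetical toy population: M = 3 clusters of unequal size.
clusters = [np.array([1.0, 2.0, 3.0]),
            np.array([4.0, 5.0]),
            np.array([6.0, 7.0, 8.0, 9.0])]
M, m, n = len(clusters), 2, 2          # SRS of m clusters, then SRS of n elements
t_y = float(sum(c.sum() for c in clusters))

# Enumerate all first-stage samples and, within each, all second-stage samples.
first_stage = list(itertools.combinations(range(M), m))
expectation = 0.0
for s in first_stage:
    second_stage = [list(itertools.combinations(range(len(clusters[i])), n))
                    for i in s]
    for subsamples in itertools.product(*second_stage):
        # Probability of this particular two-stage sample under the design.
        prob = (1.0 / len(first_stage)) * np.prod(
            [1.0 / len(options) for options in second_stage])
        sampled_y = [clusters[i][list(idx)] for i, idx in zip(s, subsamples)]
        pi_within = [np.full(n, n / len(clusters[i])) for i in s]
        estimate, _ = ht_two_stage(sampled_y, np.full(m, m / M), pi_within)
        expectation += prob * estimate

print(expectation, t_y)  # the design expectation should match the true total
```

Exhaustive enumeration is only feasible for very small populations; it is used here because it evaluates $E_p[\hat{t}_y]$ exactly rather than by Monte Carlo.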
1.2 Local Polynomial Regression Estimator

The model-assisted approach to using auxiliary information $\{x_i\}$ is to assume as a working model that the finite population point scatter $\{(x_i, t_i)\}$ is a realization from an infinite superpopulation model $\xi$, in which

$$t_i = \mu(x_i) + \varepsilon_i, \tag{3}$$

where the $\varepsilon_i$ are independent random variables with $E_\xi[\varepsilon_i] = 0$ and $\mathrm{Var}_\xi(\varepsilon_i) = \nu(x_i)$. Typically, both $\mu(x)$ and $\nu(x)$ are taken to be parametric functions of $x$, such as the linear specification $\mu(x) = \sum_j \beta_j a_{\mu,j}(x)$ and $\nu(x) = \sum_j \lambda_j a_{\nu,j}(x)$, where the $a_{\mu,j}$ and $a_{\nu,j}$ are known functions and the $\beta_j$ and $\lambda_j$ are unknown parameters. A variety of heteroskedastic polynomial regression models could be specified in this way (e.g., Särndal, Swensson, and Wretman, 1992, Section 8.4).

As mentioned in Section 1, the model-assisted methodology offers efficiency gains if the working model describes the finite population point scatter reasonably well. The problem is that, in an actual survey, there is not a single point scatter, but many, corresponding to different study variables $t_i$. Standard survey practice is to use the working model to construct one set of weights that reflects the design and the auxiliary information in $\{x_i\}$, and apply this one set of weights to all study variables. Thus, it is critical to keep the model specification flexible. This is the motivation for the nonparametric approach that we employ. Rather than specify a parametric model, we assume only that $\mu(x)$ is a smooth function of $x$. This nonparametric working model has the potential to offer efficiency gains for a greater variety of study variables than the parametric model, while maintaining most of the efficiency of the parametric regression estimator if the parametric model is correct.

We now introduce some further notation used in the nonparametric regression. Let
$K$ denote a kernel function and $h$ denote its bandwidth. Let $t_C = [t_i]_{i \in C}$ be the vector of $t_i$'s in the population of clusters. Define the $M \times (q + 1)$ matrix

$$X_{Ci} = \begin{bmatrix} 1 & x_1 - x_i & \cdots & (x_1 - x_i)^q \\ \vdots & \vdots & & \vdots \\ 1 & x_M - x_i & \cdots & (x_M - x_i)^q \end{bmatrix} = \left[\, 1 \;\; x_j - x_i \;\; \cdots \;\; (x_j - x_i)^q \,\right]_{j \in C}$$

and define the $M \times M$ matrix

$$W_{Ci} = \mathrm{diag}\left\{ \frac{1}{h} K\left( \frac{x_j - x_i}{h} \right) \right\}_{j \in C}.$$

Let $e_r$ represent the $r$th column of the identity matrix. The local polynomial regression estimator of $\mu(x_i)$, based on the entire finite population of clusters, is then given by

$$\mu_i = e_1' \left( X_{Ci}' W_{Ci} X_{Ci} \right)^{-1} X_{Ci}' W_{Ci} t_C = w_{Ci}' t_C, \tag{4}$$

which is well-defined as long as $X_{Ci}' W_{Ci} X_{Ci}$ is invertible. This is the traditional local polynomial kernel estimator described in e.g. Wand and Jones (1995).

If these $\mu_i$'s were known, then a design-unbiased estimator of $t_y$ would be the two-stage analogue of the generalized difference estimator (Särndal, Swensson, and Wretman, 1992, p. 222),

$$t_y^{*} = \sum_{i \in s} \frac{\hat{t}_i - \mu_i}{\pi_i} + \sum_{i \in C} \mu_i. \tag{5}$$

The design variance of (5) is

$$\mathrm{Var}_p(t_y^{*}) = \sum_{i,j \in C} (\pi_{ij} - \pi_i \pi_j) \frac{t_i - \mu_i}{\pi_i} \frac{t_j - \mu_j}{\pi_j} + \sum_{i \in C} \frac{V_i}{\pi_i}, \tag{6}$$

which depends on residuals from the nonparametric regression and hence is expected to be smaller than (2). Note that, since a model is assumed for the cluster totals but not for the individual observations, only the variance component at the cluster level in (6) is affected by the model.
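The population fit in (4) is an ordinary kernel-weighted least squares problem solved once per cluster. A minimal Python sketch follows, assuming an Epanechnikov kernel and scalar $x_i$; the function names `epanechnikov` and `population_fit` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel with compact support [-1, 1]."""
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def population_fit(x, t, h, q=1):
    """Local polynomial fits mu_i of equation (4) at every x_i.

    For each i, build the local design matrix X_Ci with columns
    1, (x_j - x_i), ..., (x_j - x_i)^q and the diagonal weights
    K((x_j - x_i)/h)/h, then keep the intercept of the weighted fit.
    """
    x = np.asarray(x, dtype=float)
    t = np.asarray(t, dtype=float)
    mu = np.empty_like(x)
    for i, xi in enumerate(x):
        d = x - xi
        X = np.vander(d, q + 1, increasing=True)   # M x (q+1) local design
        w = epanechnikov(d / h) / h                # diagonal of W_Ci
        XtW = X.T * w                              # X' W without forming W
        mu[i] = np.linalg.solve(XtW @ X, XtW @ t)[0]   # e_1' (X'WX)^{-1} X'W t
    return mu

# Illustration: a local linear fit (q = 1) reproduces an exactly linear scatter.
x = np.linspace(0.0, 1.0, 50)
t_linear = 1.0 + 2.0 * (x - 0.5)
mu_linear = population_fit(x, t_linear, h=0.25, q=1)
max_abs_gap = float(np.max(np.abs(mu_linear - t_linear)))
print(max_abs_gap)  # zero up to floating-point error
```

The linear-reproduction check reflects the general property that a local polynomial fit of degree $q$ is exact for polynomials of degree at most $q$, provided $X_{Ci}' W_{Ci} X_{Ci}$ is invertible at every $x_i$.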
In the present context, the population estimator $\mu_i$ cannot be calculated because only the $y_k$ in $\cup_{i \in s} s_i$ are known. Therefore, we will replace each $\mu_i$ by a sample-based consistent estimator. Let $\hat{t}_s = [\hat{t}_i]_{i \in s}$ be the vector of $\hat{t}_i$'s obtained in the sample of clusters. Define the $m \times (q + 1)$ matrix

$$X_{si} = \left[\, 1 \;\; x_j - x_i \;\; \cdots \;\; (x_j - x_i)^q \,\right]_{j \in s}, \tag{7}$$

and define the $m \times m$ matrix

$$W_{si} = \mathrm{diag}\left\{ \frac{1}{\pi_j h} K\left( \frac{x_j - x_i}{h} \right) \right\}_{j \in s}. \tag{8}$$

A design-based sample estimator of $\mu_i$ is then given by

$$\hat{\mu}_i^{o} = e_1' \left( X_{si}' W_{si} X_{si} \right)^{-1} X_{si}' W_{si} \hat{t}_s = w_{si}^{o\prime} \hat{t}_s, \tag{9}$$

as long as $X_{si}' W_{si} X_{si}$ is invertible. This estimator differs from traditional local polynomial regression because of the inclusion of the sampling weights and the fact that the cluster totals are estimated, not observed. In a design-based context, these adjustments imply that $\hat{\mu}_i^{o}$ is an estimator of $\mu_i$, the population fit, but not an estimator of $\mu(x_i)$, the model mean at $x_i$.

Substituting $\hat{t}_i$ and $\hat{\mu}_i^{o}$ respectively for $t_i$ and $\mu_i$ in (5), we have the local polynomial regression estimator for the population total of $y$,

$$\tilde{t}_y^{o} = \sum_{i \in s} \frac{\hat{t}_i - \hat{\mu}_i^{o}}{\pi_i} + \sum_{i \in C} \hat{\mu}_i^{o}. \tag{10}$$

In theory, the estimator (9) can be undefined for some $i \in C$ even if the population estimator in (4) is well-defined. As in Breidt and Opsomer (2000), we will consider an adjusted sample estimator for the theoretical derivations in Section 2. The adjusted sample estimator for $\mu_i$ is given by

$$\hat{\mu}_i = e_1' \left( X_{si}' W_{si} X_{si} + \mathrm{diag}\left\{ \frac{\delta}{M^2} \right\}_{j=1}^{q+1} \right)^{-1} X_{si}' W_{si} \hat{t}_s = w_{si}' \hat{t}_s, \tag{11}$$

for some small $\delta > 0$. The value $\delta / M^2$ in (11) is a small order adjustment that guarantees the estimator's existence for any $s \subset C$, as long as the population estimator in (4) is defined for all $i \in C$. This adjustment was also used by Fan (1992) in the study of the theoretical properties of local polynomial regression. We let

$$\tilde{t}_y = \sum_{i \in s} \frac{\hat{t}_i - \hat{\mu}_i}{\pi_i} + \sum_{i \in C} \hat{\mu}_i \tag{12}$$

denote the local polynomial regression estimator that uses the adjusted sample estimator in (11). The estimator for single-stage cluster sampling is obtained if we set $\hat{t}_i = t_i$ for all $i \in C$.

Modeling the cluster totals as in (3) is not the only possible approach. Another possibility is to model the cluster means as

$$N_i^{-1} t_i = \alpha(x_i) + \varepsilon_i, \tag{13}$$

where the $\varepsilon_i$ are independent random variables with mean zero and variance $\nu(x_i)$, $\alpha(x)$ is smooth, and $\nu(x)$ is smooth and strictly positive. In this case $\hat{\mu}_i$ in (12) would be replaced by $N_i \hat{\alpha}_i$, where the $\hat{\alpha}_i$ are obtained via nonparametric regression of $N_i^{-1} \hat{t}_i$ on $x_i$, using the local design matrix (7) and local weighting matrix (8). Model (13) will not be considered further in this paper.

1.3 Assumptions

To prove our theoretical results, we adopt an asymptotic framework in which both the population number of clusters, $M$, and the sample number of clusters, $m$, tend to infinity. The number of elements within each cluster, $N_i$, remains bounded, so that no cluster dominates the population. Subsampling within selected clusters is carried out as described in Section 1.1. We make the following additional assumptions on the study variable, the design, and the smoothing methodology:
A1 Distribution of the errors under $\xi$: the errors $\varepsilon_i$ are independent and have mean zero, variance $\nu(x_i)$, and compact support, uniformly for all $M$.

A2 For each $M$, the $x_i$ are considered fixed with respect to the superpopulation model $\xi$. The $x_i$ are independent and identically distributed $F(x) = \int_{-\infty}^{x} f(t)\,dt$, where $f(\cdot)$ is a density with compact support $[a_x, b_x]$ and $f(x) > 0$ for all $x \in [a_x, b_x]$.

A3 The mean function $\mu$ is continuous on $[a_x, b_x]$.

A4 Kernel $K$: the kernel $K(\cdot)$ has compact support $[-1, 1]$, is symmetric and continuous, and satisfies $\int_{-1}^{1} K(u)\,du = 1$.

A5 First-stage sampling rate $mM^{-1}$, bandwidth $h$, and cluster size $N_i$: as $M \to \infty$, $mM^{-1} \to \pi \in (0, 1)$, $h \to 0$, $Mh^2/(\log \log M) \to \infty$, and $N_i$ is uniformly bounded above for all clusters and for all $M$.

A6 First-stage (cluster) inclusion probabilities $\pi_i$ and $\pi_{ij}$: for all $M$, $\min_{i \in C} \pi_i \ge \lambda > 0$, $\min_{i,j \in C} \pi_{ij} \ge \lambda > 0$ and

$$\limsup_{M \to \infty} \, m \max_{i,j \in C:\, i \ne j} \left| \pi_{ij} - \pi_i \pi_j \right| < \infty.$$

A7 Additional assumptions involving higher-order first-stage inclusion probabilities:

$$\lim_{M \to \infty} m^2 \max_{(i_1, i_2, i_3, i_4) \in D_{4,M}} \left| E_I\left[ (I_{i_1} - \pi_{i_1})(I_{i_2} - \pi_{i_2})(I_{i_3} - \pi_{i_3})(I_{i_4} - \pi_{i_4}) \right] \right| < \infty,$$

where $D_{t,M}$ denotes the set of all distinct $t$-tuples $(i_1, i_2, \ldots, i_t)$ from $C$,

$$\lim_{M \to \infty} \max_{(i_1, i_2, i_3, i_4) \in D_{4,M}} \left| E_I\left[ (I_{i_1} I_{i_2} - \pi_{i_1 i_2})(I_{i_3} I_{i_4} - \pi_{i_3 i_4}) \right] \right| = 0,$$

$$\limsup_{M \to \infty} \, m \max_{(i_1, i_2, i_3) \in D_{3,M}} \left| E_I\left[ (I_{i_1} - \pi_{i_1})^2 (I_{i_2} - \pi_{i_2})(I_{i_3} - \pi_{i_3}) \right] \right| < \infty,$$

and

$$\limsup_{M \to \infty} \, m^2 \max_{(i_1, i_2, i_3) \in D_{3,M}} \left| E_I\left[ (I_{i_1} - \pi_{i_1})(I_{i_2} - \pi_{i_2})(I_{i_3} - \pi_{i_3}) \right] \right| < \infty.$$

A8 The second-stage design is invariant and independent, with $n_i \ge 1$ for every $i \in s$ and for every possible first-stage sample $s$. Further, the second-stage inclusion probabilities are uniformly bounded away from zero for all clusters and all $M$.

A9 The second-stage joint inclusion probabilities are uniformly bounded away from zero for all clusters and all $M$.

Remarks:

1. The assumptions in A1 and A3 are weaker than those in standard kernel regression, because we are not attempting to estimate the superpopulation mean function $\mu(\cdot)$, but only the finite population nonparametric fits, $\{\mu_i\}_{i=1}^{M}$.

2. Assumptions A1-A7 are adapted from Breidt and Opsomer (2000) to the two-stage sampling case. The last expression added in A7 for this case becomes $N^2 \left( \rho_3 - 3\rho_2 \rho_1 + 2\rho_1^3 \right) = O(1)$, with the notation that $\rho_k$ is the $k$th order inclusion probability of $k$ distinct elements under simple random sampling without replacement. Straightforward
extension of the results in that paper shows that the design assumptions will hold for simple random sampling of clusters, stratified simple random sampling of clusters, and related designs.

3. The subsampling design can be quite general, but is subject to the mild restrictions imposed by A8 (for consistent total estimation) and A9 (for consistent variance estimation). In particular, A8 together with the bounded cluster sizes and bounded error terms implies that $\hat{t}_i$ and $V_i$ are uniformly bounded for all clusters and all $M$, which will be used in several of the proofs. If clusters are completely enumerated, then A8 and A9 are satisfied trivially, and the results of Breidt and Opsomer (2000) can be applied directly. See Remark (v) of Section 1.3 of that paper.

4. For the second stage, we assume in A8 that $n_i \ge 1$ for every $i \in s$, and for every possible first-stage sample $s$, so that $\hat{t}_i$ and $\hat{V}_i$ are well-defined. Alternatively, we can let $\hat{t}_i = 0$ and $\hat{V}_i = 0$ when $n_i = 0$ for $i \in s$ to make $\hat{t}_i$ and $\hat{V}_i$ well-defined.

2 Main Results

2.1 Weighting and Calibration

The nonparametric regression estimator can be expressed as a linear combination of the study variables, with weights that do not depend on the study variables. These weights are extremely useful in practice. From (11) and (12), note that

$$\tilde{t}_y = \sum_{i \in s} \left\{ \frac{1}{\pi_i} + \sum_{j \in C} \left( 1 - \frac{I_j}{\pi_j} \right) w_{sj}' e_i \right\} \hat{t}_i = \sum_{i \in s} \omega_{is} \hat{t}_i = \sum_{i \in s} \sum_{k \in s_i} \frac{\omega_{is}}{\pi_{k|i}} y_k. \tag{14}$$

Thus, $\tilde{t}_y$ is a linear combination of the $\hat{t}_i$'s in $s$, with cluster weights $\{\omega_{is}\}$ that are the sampling weights of clusters, suitably modified to reflect auxiliary information $[x_i]$. Alternatively, $\tilde{t}_y$ is a linear combination of the $y_k$'s in $\cup_{i \in s} s_i$, with element weights $\{\omega_{is} \pi_{k|i}^{-1}\}$ that reflect both the design and the auxiliary information.

Because both sets of weights are independent of the study variables, they can be applied to any study variable of interest. In particular, the weights $\omega_{is}$ could be applied to $x_i^l$. If $\delta = 0$, then $\omega_{is} = \omega_{is}^{o}$ and

$$\sum_{i \in s} \omega_{is}^{o} x_i^l = \sum_{i \in C} x_i^l$$

for $l = 0, 1, \ldots, q$. That is, the weights are exactly calibrated to the $q + 1$ known control totals $M, t_x, \ldots, t_{x^q}$; the first control total is the number of clusters because the weights are attached to clusters. If $\mu(x_i)$ is exactly a $q$th degree polynomial, then the unconditional expectation (with respect to design and model) of $\tilde{t}_y^{o} - t_y$ is exactly zero. If $\delta \ne 0$, then this calibration property holds approximately.

2.2 Asymptotic Results

In general, the local polynomial regression estimator $\tilde{t}_y$ is not design-unbiased because the $\hat{\mu}_i$ are nonlinear functions of design-unbiased estimators. However, $\tilde{t}_y$ is asymptotically design-unbiased and design consistent under mild conditions.

Theorem 1 In two-stage element sampling, and under A1-A8, the local polynomial regression estimator

$$\tilde{t}_y = \sum_{i \in C} \left\{ (\hat{t}_i - \hat{\mu}_i) \frac{I_i}{\pi_i} + \hat{\mu}_i \right\}$$

is asymptotically design-unbiased (ADU) in the sense that

$$\lim_{M \to \infty} E_p\left[ \frac{\tilde{t}_y - t_y}{M} \right] = 0 \quad \text{with } \xi\text{-probability one,}$$
and is design consistent in the sense that

$$\lim_{M \to \infty} E_p\left[ I_{\{ |\tilde{t}_y - t_y| > M\eta \}} \right] = 0 \quad \text{with } \xi\text{-probability one}$$

for all $\eta > 0$.

Under the same conditions as in Theorem 1, we obtain the asymptotic design mean squared error of the local polynomial regression estimator $\tilde{t}_y$ in two-stage element sampling. The asymptotic design mean squared error consists of first- and second-stage variance components, and is equivalent to the variance of the generalized difference estimator, given in (6). As noted after equation (6) above, the second-stage variance is unaffected by the regression estimation at the cluster level.

Theorem 2 In two-stage element sampling, and under A1-A8,

$$m E_p\left( \frac{\tilde{t}_y - t_y}{M} \right)^2 = \frac{m}{M^2} \sum_{i,j \in C} (t_i - \mu_i)(t_j - \mu_j) \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j} + \frac{m}{M^2} \sum_{i \in C} \frac{V_i}{\pi_i} + o(1).$$

The next result shows that the asymptotic design mean squared error can be estimated consistently under mild assumptions.

Theorem 3 In two-stage element sampling, and under A1-A9,

$$\lim_{M \to \infty} m E_p\left| \hat{V}(M^{-1} \tilde{t}_y) - \mathrm{AMSE}(M^{-1} \tilde{t}_y) \right| = 0,$$

where

$$\hat{V}(M^{-1} \tilde{t}_y) = \frac{1}{M^2} \sum_{i,j \in C} (\hat{t}_i - \hat{\mu}_i)(\hat{t}_j - \hat{\mu}_j) \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j} \frac{I_i I_j}{\pi_{ij}} + \frac{1}{M^2} \sum_{i \in C} \hat{V}_i \frac{I_i}{\pi_i}$$

and

$$\mathrm{AMSE}(M^{-1} \tilde{t}_y) = \frac{1}{M^2} \sum_{i,j \in C} (t_i - \mu_i)(t_j - \mu_j) \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j} + \frac{1}{M^2} \sum_{i \in C} \frac{V_i}{\pi_i}.$$

Therefore, $\hat{V}(M^{-1} \tilde{t}_y)$ is asymptotically design-unbiased and design consistent for $\mathrm{AMSE}(M^{-1} \tilde{t}_y)$.

Using the weighted residual technique (Särndal, Swensson, and Wretman, 1989), we could construct an alternative variance estimator with the local polynomial regression weights $\omega_{is}$ in (14),

$$\hat{V}_w(M^{-1} \tilde{t}_y) = \frac{1}{M^2} \left\{ \sum_{i,j \in s} \omega_{is} (\hat{t}_i - \hat{\mu}_i) \, \omega_{js} (\hat{t}_j - \hat{\mu}_j) \frac{\pi_{ij} - \pi_i \pi_j}{\pi_{ij}} + \sum_{i \in s} \omega_{is}^2 \hat{V}_i \right\}.$$

Analogous results for the generalized regression estimator are given in Särndal, Swensson, and Wretman (1992).

3 Simulation Results

We performed some simulation experiments in order to compare the performance of the local polynomial regression estimator in two-stage element sampling with that of several parametric and nonparametric estimators. The estimators considered are the same as those in Breidt and Opsomer (2000) adapted to the two-stage case, and are denoted as follows:

HT     Horvitz-Thompson                  equation (1)
REG    linear regression                 Särndal, Swensson, and Wretman (1992, p. 309)
REG3   cubic regression
PS     poststratification                Cochran (1977, p. 134)
LPR0   local polynomial with q = 0       equation (12)
LPR1   local polynomial with q = 1       equation (12)
KERN   model-based nonparametric         Dorfman (1992)
CDW    bias-calibrated nonparametric     Chambers, Dorfman, and Wehrly (1993)
The first four estimators are parametric and the last four are nonparametric. Of the parametric estimators, HT is purely design-based and REG and REG3 are model-assisted. For the poststratification estimator, we divide the $x$-range into ten equally-spaced strata. The number of poststrata was chosen to ensure a very small probability of empty poststrata. Among the four nonparametric estimators, LPR0 and LPR1 are model-assisted and KERN and CDW are model-based.

KERN and CDW considered here are extended versions of estimators proposed in Dorfman (1992) and Chambers et al. (1993) to two-stage element sampling with auxiliary information available for all clusters. Since the cluster totals $t_i$ are unknown for sampled clusters $i \in s$, the HT estimators $\hat{t}_i$ are instead used to construct KERN and CDW. In KERN, the mean function is estimated via nonparametric regression of cluster total estimators $\hat{t}_s = [\hat{t}_i]_{i \in s}$ on $\{x_i\}_{i \in s}$, and this estimated mean function is used to predict each non-sampled cluster total $t_i$. In CDW, we take $\mu(x) = x\beta$, $\nu(x) = \sigma^2$ as a working parametric model $\xi$. Each non-sampled cluster total $t_i$ is first predicted by estimating its parametric mean function under $\xi$ with cluster total estimators $\hat{t}_s = [\hat{t}_i]_{i \in s}$, and then its bias is predicted using nonparametric regression to define a predictor of the cluster total robust to misspecification of the working model. Note that the robust predictor can equally be viewed as a bias-adjusted version of a nonparametric predictor of $t_i$ under the working model $\xi$. In both KERN and CDW, the Nadaraya-Watson estimator is used.

The Epanechnikov kernel,

$$K(t) = \frac{3}{4} (1 - t^2) I_{\{|t| \le 1\}},$$
and two bandwidth values (0.1 and 0.25) are used for all nonparametric estimators. The first bandwidth is equal to the poststratum width and the second is based on an ad hoc rule of 1/4th the data range. Bandwidth selection for local polynomial regression will be explored at a later date.

Following Breidt and Opsomer (2000), we consider several mean functions of the cluster totals:

linear:       $\mu_1(x) = 1 + 2(x - 0.5)$,
quadratic:    $\mu_2(x) = 1 + 2(x - 0.5)^2$,
bump:         $\mu_3(x) = 1 + 2(x - 0.5) + \exp(-200(x - 0.5)^2)$,
jump:         $\mu_4(x) = \{1 + 2(x - 0.5) I_{\{x \le 0.65\}}\} + 0.65 I_{\{x > 0.65\}}$,
exponential:  $\mu_5(x) = \exp(-8x)$,
cycle1:       $\mu_6(x) = 2 + \sin(2\pi x)$,
cycle4:       $\mu_7(x) = 2 + \sin(8\pi x)$,

with $x \in [0, 1]$. These represent a range of correct and incorrect model specifications for the various estimators considered. For $\mu_1$, REG and CDW are expected to perform better than others because the model is correctly specified. The mean function $\mu_2$ is quadratic, so that it is smooth but far from linear. The function $\mu_3$ is smooth and nearly linear, $\mu_4$ is not smooth, and $\mu_5$ is an exponential curve. The functions $\mu_6$ and $\mu_7$ are sinusoids with period 1 and 0.25, respectively.

The population $x_i$ of size $M = 1000$ are generated as independent and identically distributed (iid) uniform(0,1) random variables. For each generated value $x_i$ and each study variable $j = 1, \ldots, 7$, $N_i$ element values are generated as

$$y_{jk} = \frac{\mu_j(x_i)}{N_i} + \frac{\varepsilon_{jk}}{N_i^{1/2}}, \qquad \{\varepsilon_{jk}\} \text{ iid } N(0, \sigma^2),$$

where $k \in U_i$. Thus $t_{ij}$ has mean $\mu_j(x_i)$ and variance $\nu_j(x_i) = \sigma^2$. Two values for the
standard deviation of the errors are used: $\sigma = 0.1$ and $0.4$.

At stage one, a sample of clusters is first generated by simple random sampling with sample size $m = 100$ and then, at stage two, subsamples of elements within each selected cluster are generated by simple random sampling using sample size $n_i$. We have considered three cases with different second-stage sampling rates: constant cluster size $N_i = 100$ with $n_i = 10$, constant cluster size $N_i = 100$ with $n_i = 100$, and random cluster size $N_i$ distributed as Poisson(3) + 1 with $n_i = \lfloor 0.5 N_i \rfloor + 1$, where $\lfloor a \rfloor$ denotes the integer part of $a$.

The results for constant cluster size $N_i = 100$ with $n_i = 100$ are similar to those for the element sampling case with sampling rate 0.1 in Breidt and Opsomer (2000). In the case of constant cluster size $N_i = 100$ with $n_i = 10$, the local linear regression estimator does not gain a large amount of efficiency, since the low second-stage sampling rate makes it relatively difficult to find an identifiable pattern in the plot of the relationship between the auxiliary variable $x_i$ and the estimated cluster total $\hat{t}_i$. As the second-stage sampling rate increases, the local linear regression estimator gains more efficiency over the other estimators. Here, we only report on the experiment with the random cluster sizes. Such clusters of moderate and variable size might be encountered in a household survey, for instance.

For each combination of mean function, standard deviation and bandwidth, 1000 replicate two-stage element samples from the finite population are selected and then the estimators are calculated. Table 1 shows the ratios of design mean squared errors (MSEs) for all estimators mentioned above to that for the local polynomial regression estimator with $q = 1$ (LPR1). Overall, the performance of the LPR1 estimator is good, particularly at the small value of $\sigma$. As is expected, REG and CDW perform best for the linear study variable. In general, both parametric and nonparametric estimators perform better
than the HT estimator for all study variables and have more efficiency at the small value of $\sigma$. Among the parametric estimators, REG3 and PS generally perform better than REG except in the linear study variable. LPR1 is competitive with or better than the parametric estimators in most cases, but PS in the bump study variable and REG3 and PS in the cycle1 and cycle4 study variables are much better than the oversmoothed LPR1 estimator. Compared to other nonparametric estimators, LPR1 is competitive or better in most cases, with the efficiency gain depending on study variable, bandwidth, and their interaction.

To assess further the effect of bandwidth on the nonparametric estimators, we considered three large bandwidths $h = 0.5$, $1.0$, and $1.5$, but do not table the results here. As the bandwidth becomes large, LPR1 becomes equivalent to REG and the performance of LPR0 and KERN becomes similar to that of HT, as expected theoretically. CDW becomes theoretically equivalent to the classical regression estimator with a linear fit through the origin, which is less efficient than REG, as the bandwidth becomes large.

In summary, LPR1 is at least as good as HT for all study variables, bandwidths, and noise levels, and is sometimes much better. LPR1 is more efficient than REG for all study variables but linear, while being nearly as efficient in the linear case. LPR1 is at least as good as the remaining parametric estimators (REG3 and PS), with isolated exceptions in which the parametric specification is very nearly correct and LPR oversmooths. Finally, LPR1 dominates the other nonparametric estimators; it is sometimes much better than each, and is never much worse (MSE ratios $\ge 0.93$).
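A scaled-down version of this kind of experiment can be sketched in Python as follows. It is a rough illustration under stated assumptions rather than a reproduction of Table 1: the population and sample sizes (M = 500, m = 50), the 100 replicates, the seed, the restriction to the linear mean function with $\sigma = 0.1$ and $h = 0.25$, and the helper names `smoother_rows` and `lpr_weights` are all choices made here for brevity. The weights follow (14) with $\delta = 0$, so the calibration property of Section 2.1 can be checked directly.

```python
import numpy as np

def epanechnikov(u):
    return 0.75 * (1.0 - u**2) * (np.abs(u) <= 1.0)

def smoother_rows(x_pop, x_s, pi_s, h, q=1):
    """Row j holds w_sj', so that mu_hat_j = w_sj' @ t_hat_s (equation (9))."""
    S = np.empty((x_pop.size, x_s.size))
    for j, xj in enumerate(x_pop):
        d = x_s - xj
        X = np.vander(d, q + 1, increasing=True)   # local design matrix (7)
        w = epanechnikov(d / h) / (pi_s * h)       # diagonal of W_sj in (8)
        XtW = X.T * w
        S[j] = np.linalg.solve(XtW @ X, XtW)[0]    # first row of (X'WX)^{-1} X'W
    return S

def lpr_weights(x_pop, in_sample, pi_pop, h, q=1):
    """Cluster weights omega_is of equation (14), taking delta = 0."""
    s = np.flatnonzero(in_sample)
    S = smoother_rows(x_pop, x_pop[s], pi_pop[s], h, q)
    g = 1.0 - in_sample.astype(float) / pi_pop     # (1 - I_j / pi_j), j in C
    return 1.0 / pi_pop[s] + g @ S

rng = np.random.default_rng(0)
M, m, h, sigma, reps = 500, 50, 0.25, 0.1, 100     # reduced sizes (assumption)
x = rng.uniform(0.0, 1.0, M)
N = rng.poisson(3, M) + 1                          # random cluster sizes
mu = 1.0 + 2.0 * (x - 0.5)                         # linear mean function mu_1
y = [mu[i] / N[i] + rng.normal(0.0, sigma, int(N[i])) / np.sqrt(N[i])
     for i in range(M)]
t_y = float(sum(v.sum() for v in y))
pi = np.full(M, m / M)                             # SRS of clusters

ht_est, lpr_est = [], []
for _ in range(reps):
    I = np.zeros(M, dtype=bool)
    I[rng.choice(M, m, replace=False)] = True
    s = np.flatnonzero(I)
    t_hat = np.empty(m)
    for a, i in enumerate(s):
        n_i = int(N[i]) // 2 + 1                   # n_i = floor(0.5 N_i) + 1
        sub = rng.choice(int(N[i]), n_i, replace=False)
        t_hat[a] = N[i] / n_i * y[i][sub].sum()    # second-stage HT estimator
    omega = lpr_weights(x, I, pi, h)
    ht_est.append(np.sum(t_hat / pi[s]))           # equation (1)
    lpr_est.append(omega @ t_hat)                  # equation (10) in weight form

mse_ht = float(np.mean((np.array(ht_est) - t_y) ** 2))
mse_lpr = float(np.mean((np.array(lpr_est) - t_y) ** 2))
print("MSE ratio HT / LPR1:", mse_ht / mse_lpr)
print("calibration:", omega.sum(), M, omega @ x[s], x.sum())
```

Because the weights do not depend on the study variable, the same `omega` could be reused for any other variable measured on the same sample; only `t_hat` would change.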
4 Example: Erosion study from the National Resources Inventory

In this section, we apply local polynomial regression estimation to data from the 1995 National Resources Inventory Erosion Update Study (see Breidt and Fuller, 1999). The National Resources Inventory (NRI) is a stratified two-stage area sample of the agricultural lands in the United States conducted by the Natural Resources Conservation Service (NRCS) of the U.S. Department of Agriculture (Breidt, 2001). The 1995 Erosion Update Study was a smaller-scale study using NRI information as frame material.

In the 1995 study, first-stage sampling strata were 14 states in the Midwest and Great Plains regions and primary sampling units (PSUs) were counties within states. A categorical variable was used for within-county stratification in second-stage sampling. Second-stage sampling units (SSUs) were NRI segments of land, 160 acres in size. The auxiliary variable for each county was $x_i$, the square root of a size measure of land with erosion potential. (We used square root to reduce the sparseness of points in the regressor space.) The variables of interest were two kinds of erosion measurements, roughly characterized as wind erosion (WEQ) and water erosion (USLE).

At stage one, a sample of 213 counties was selected by stratified sampling from the population of 1357 counties, with probability proportional to $x_i^2$. Subsamples of NRI segments within the selected counties were selected by stratified unequal probability sampling at stage two. In total, 1900 NRI segments were selected.

The Horvitz-Thompson (HT), linear regression (REG), and local linear regression (LPR1) estimates for WEQ and USLE totals and the corresponding variance estimates were calculated from the sample. We calculated REG estimates with three different
variances of the errors ($\nu(x) \propto x^2$, $x^4$, and $x^8$), denoted by REG2, REG4, and REG8 respectively. Weighted regressions were used because the data displayed large amounts of heteroskedasticity (see Figure 1), which can have an effect on the parametric fit. The Epanechnikov kernel with three different bandwidths ($h = 1$, $3$, and $5$) was used for the LPR1 estimator. Because of data sparseness, the smallest allowable bandwidth for this example is (to the nearest tenth) $h = 1$.

Table 2 shows HT, REG and LPR1 estimates of WEQ and USLE totals and estimated standard errors. For each estimator, weights were constructed and applied to both study variables. Standard errors were estimated by assuming unequal-probability with-replacement sampling within design strata at stage one, and unequal-probability with-replacement sampling within clusters at stage two. Using the estimated standard errors as a guide, LPR1 with $h = 1$ performs best among all estimates and REG4 is best among REG estimates. The estimated function with $h = 1$ is quite rough, so we use $h = 3$ for further comparisons. Except at relatively large bandwidths (e.g. LPR1 with $h = 5$), LPR1 estimates are better than HT and REG estimates on the basis of estimated standard errors for both WEQ and USLE.

Figure 1 shows the relationship between $x_i$ = square root of size measure of land with erosion potential and estimated county total ($\hat{t}_i$) in sampled counties at stage one for WEQ and USLE, on both the original and square-root transformed vertical scales. In all plots, the weighted linear regression fit with variance proportional to $x^2$ (REG4) and the local linear regression fit with bandwidth $h = 3$ (LPR1) are included. (The square root transformation in the figure is included to make differences in the fits more discernible.) The LPR1 fit appears quite sensible. It is at least competitive with the REG estimators, if not better, but requires neither mean nor variance function specification.
The same weights used for WEQ and USLE could be applied to any other study variables obtained in the Erosion Update Study, with efficiency increases over HT if the variable is dependent on the erosion potential size measure, and with efficiency increases over REG if the dependence is non-linear.

5 Conclusion

We have developed a nonparametric survey regression methodology for two-stage finite population sampling, in which complete auxiliary information is available for all first-stage sampling units. The estimator is a linear combination of cluster total estimators, with weights that are calibrated to known control totals. This weighted form is operationally convenient. Further, the estimator has desirable theoretical properties including asymptotic design-unbiasedness and design consistency. Simulation results show that the nonparametric estimator dominates several parametric estimators when the model regression function is incorrectly specified, while being nearly as efficient when the parametric specification is correct. In an application to data from the 1995 National Resources Inventory Erosion Update Study, the nonparametric methodology compares favorably with Horvitz-Thompson and classical survey regression estimates for wind and water erosion.

References

Breidt, F.J. (2002). National Resources Inventory (NRI), US. Encyclopedia of Environmetrics, vol. 3. A.H. El-Shaarawi and W.W. Piegorsch, eds. Wiley.
Breidt, F.J. and Fuller, W.A. (1999). Design of supplemented panel surveys with application to the National Resources Inventory. Journal of Agricultural, Biological, and Environmental Statistics 4.

Breidt, F.J. and Opsomer, J.D. (2000). Local polynomial regression estimators in survey sampling. Annals of Statistics 28.

Brewer, K.R.W. (1963). Ratio estimation in finite populations: some results deducible from the assumption of an underlying stochastic process. Australian Journal of Statistics 5.

Cassel, C.-M., Särndal, C.-E., and Wretman, J.H. (1976). Some results on generalized difference estimation and generalized regression estimation for finite populations. Biometrika 63.

Cassel, C.-M., Särndal, C.-E., and Wretman, J.H. (1977). Foundations of Inference in Survey Sampling. Wiley, New York.

Chambers, R.L., Dorfman, A.H., and Wehrly, T.E. (1993). Bias robust estimation in finite populations using nonparametric calibration. Journal of the American Statistical Association 88.

Cochran, W.G. (1977). Sampling Techniques, 3rd ed. Wiley, New York.

Dorfman, A.H. (1992). Nonparametric regression for estimating totals in finite populations. Proceedings of the Section on Survey Research Methods, American Statistical Association.

Fan, J. (1992). Design-adaptive nonparametric regression. Journal of the American Statistical Association 87.

Fuller, W.A. (1996). Introduction to Statistical Time Series, second edition. Wiley, New York.
Holt, D. and Smith, T.M. (1979). Post stratification. Journal of the Royal Statistical Society, Series A 142.

Horvitz, D.G. and Thompson, D.J. (1952). A generalization of sampling without replacement from a finite universe. Journal of the American Statistical Association 47.

Robinson, P.M. and Särndal, C.-E. (1983). Asymptotic properties of the generalized regression estimation in probability sampling. Sankhyā: The Indian Journal of Statistics, Series B 45.

Royall, R.M. (1970). On finite population sampling under certain linear regression models. Biometrika 57.

Särndal, C.-E. (1980). On π-inverse weighting versus best linear unbiased weighting in probability sampling. Biometrika 67.

Särndal, C.-E., Swensson, B., and Wretman, J.H. (1989). The weighted residual technique for estimating the variance of the general regression estimator of the finite population total. Biometrika 76.

Särndal, C.-E., Swensson, B., and Wretman, J. (1992). Model Assisted Survey Sampling. Springer, New York.

Wand, M.P. and Jones, M.C. (1995). Kernel Smoothing. Chapman and Hall, London.

Wu, C. and Sitter, R.R. (2001). A model-calibration approach to using complete auxiliary information from survey data. Journal of the American Statistical Association 96.
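Appendix: Technical Derivations

In this appendix, we first state and prove three lemmas, then prove the theorems of Section 2. The proofs, like those in Breidt and Opsomer (2000), involve straightforward but tedious bounding arguments. In order to examine the design properties of the local polynomial regression estimator, we use the Taylor linearization technique. Note first that $\mu_i$ and $\hat\mu_i$ can be expressed as functions of population means; that is, for some function $f$, $\mu_i = f(M^{-1}s_i, 0)$ and $\hat\mu_i = f(M^{-1}\hat s_i, \delta)$, where

$$s_i = \begin{bmatrix} s_{1i} \\ s_{2i} \end{bmatrix} \quad\text{and}\quad \hat s_i = \begin{bmatrix} \hat s_{1i} \\ \hat s_{2i} \end{bmatrix},$$

$$s_{1i} = [s_{1ig}]_{g=1}^{G_1} = \left[\sum_{k\in C}\frac{1}{h}K\!\left(\frac{x_k-x_i}{h}\right)(x_k-x_i)^{g-1}\right]_{g=1}^{G_1} = \left[\sum_{k\in C} z_{igk}\right]_{g=1}^{G_1},$$

$$s_{2i} = [s_{2ig}]_{g=1}^{G_2} = \left[\sum_{k\in C}\frac{1}{h}K\!\left(\frac{x_k-x_i}{h}\right)(x_k-x_i)^{g-1}t_k\right]_{g=1}^{G_2} = \left[\sum_{k\in C} z_{igk}\,t_k\right]_{g=1}^{G_2},$$

$$\hat s_{1i} = [\hat s_{1ig}]_{g=1}^{G_1} = \left[\sum_{k\in C}\frac{1}{h}K\!\left(\frac{x_k-x_i}{h}\right)(x_k-x_i)^{g-1}\frac{I_k}{\pi_k}\right]_{g=1}^{G_1} = \left[\sum_{k\in C} z_{igk}\frac{I_k}{\pi_k}\right]_{g=1}^{G_1},$$

$$\hat s_{2i} = [\hat s_{2ig}]_{g=1}^{G_2} = \left[\sum_{k\in C}\frac{1}{h}K\!\left(\frac{x_k-x_i}{h}\right)(x_k-x_i)^{g-1}\hat t_k\frac{I_k}{\pi_k}\right]_{g=1}^{G_2} = \left[\sum_{k\in C} z_{igk}\,\hat t_k\frac{I_k}{\pi_k}\right]_{g=1}^{G_2}.$$

For local polynomial regression of degree $q$, $G_1 = 2q+1$ and $G_2 = q+1$. Now, we define $\mu_{\delta i}$ by substituting $s_i$ for $\hat s_i$ in $\hat\mu_i$; that is, $\mu_{\delta i} = f(M^{-1}s_i, \delta)$. Using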
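the mean value theorem, for some $\delta_i^* \in (0, \delta)$,

$$\mu_{\delta i} = \mu_i + \left.\frac{\partial \mu_{\delta i}}{\partial (M^{-2}\delta)}\right|_{\delta=\delta_i^*}\frac{\delta}{M^2}, \tag{15}$$

and we let

$$R_i = \hat\mu_i - \mu_{\delta i} - \frac{1}{M}\sum_{k\in C} z_{1ik}\left(\frac{I_k}{\pi_k}-1\right) - \frac{1}{M}\sum_{k\in C} z_{2ik}\left(\hat t_k\frac{I_k}{\pi_k}-t_k\right), \tag{16}$$

where

$$z_{1ik} = \sum_{g=1}^{G_1}\left.\frac{\partial\hat\mu_i}{\partial(M^{-1}\hat s_{1ig})}\right|_{\hat s_i = s_i} z_{igk} \quad\text{and}\quad z_{2ik} = \sum_{g=1}^{G_2}\left.\frac{\partial\hat\mu_i}{\partial(M^{-1}\hat s_{2ig})}\right|_{\hat s_i = s_i} z_{igk}.$$

Lemma 1. Under A1-A8, as $M \to \infty$,

$$\frac{m}{M}\sum_{i\in C}E_p\left[R_i^2\right] = O\!\left(\frac{1}{mh^2}\right).$$

Proof of Lemma 1: Note that

$$\frac{m^2h^2}{M}\sum_{i\in C}E_p\left|\frac{\hat s_{1ig}-s_{1ig}}{M}\right|^4 = O(1)$$

by the proof of Lemma 3 in Breidt and Opsomer (2000). Define

$$\tilde s_{2ig} = \sum_{k\in C}\frac{1}{h}K\!\left(\frac{x_k-x_i}{h}\right)(x_k-x_i)^{g-1}t_k\frac{I_k}{\pi_k}.$$

Then

$$\frac{m^2h^2}{M}\sum_{i\in C}E_p\left|\frac{\hat s_{2ig}-s_{2ig}}{M}\right|^4$$

is bounded by a constant multiple of $B_1 + B_2 + B_3$, where

$$B_1 = \frac{m^2h^2}{M}\sum_{i\in C}E_p\left|\frac{\hat s_{2ig}-\tilde s_{2ig}}{M}\right|^4,\quad B_2 = \frac{m^2h^2}{M}\sum_{i\in C}E_p\left[\frac{(\hat s_{2ig}-\tilde s_{2ig})^2(\tilde s_{2ig}-s_{2ig})^2}{M^4}\right],\quad B_3 = \frac{m^2h^2}{M}\sum_{i\in C}E_p\left|\frac{\tilde s_{2ig}-s_{2ig}}{M}\right|^4.$$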
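Now $B_3 = O(1)$ by Lemma 3 in Breidt and Opsomer (2000), so to show that $B_1 + B_2 + B_3 = O(1)$, it suffices to show that $B_1 = O(1)$, then use Cauchy-Schwarz on $B_2$. Using the independence of the second-stage design,

$$\begin{aligned}
B_1 ={}& \frac{3m^2h^2}{M}\sum_{i\in C}\sum_{k\ne l}\frac{1}{M^4h^4}K^2\!\left(\frac{x_k-x_i}{h}\right)K^2\!\left(\frac{x_l-x_i}{h}\right)(x_k-x_i)^{2(g-1)}(x_l-x_i)^{2(g-1)}\frac{V_kV_l\,\pi_{kl}}{\pi_k^2\pi_l^2}\\
&+ \frac{m^2h^2}{M}\sum_{i\in C}\sum_{k\in C}\frac{1}{M^4h^4}K^4\!\left(\frac{x_k-x_i}{h}\right)(x_k-x_i)^{4(g-1)}\frac{E_{II}(\hat t_k-t_k)^4}{\pi_k^3}\\
\le{}& c_1\frac{1}{M}\sum_{i\in C}\left(\sum_{k\in C}\frac{I_{\{x_i-h\le x_k\le x_i+h\}}}{Mh}\right)^2 + c_2\frac{1}{M}\sum_{i\in C}\sum_{k\in C}\frac{I_{\{x_i-h\le x_k\le x_i+h\}}}{M^2h^2},
\end{aligned}$$

which is bounded by Lemma 2 in Breidt and Opsomer (2000). The assumptions of the theorem of Fuller (1996) with $\alpha = 1$, $s = 4$, $a^4 = O(m^{-2}h^{-(2+\tau)})$, and expectation $\frac{1}{M}\sum_{i\in C}E_p[\cdot]$ are then met for the sequence $\{R_i^2\}$. Since this function and its first three derivatives with respect to the elements of $M^{-1}s_i$ evaluate to zero, the result follows.

Lemma 2. Under A1-A8,

$$\lim_{M\to\infty}\frac{1}{M}\sum_{i\in C}E_p(\hat\mu_i-\mu_i)^2 = 0.$$

Proof of Lemma 2: We write

$$\frac{1}{M}\sum_{i\in C}E_p(\hat\mu_i-\mu_i)^2 = \frac{1}{M}\sum_{i\in C}E_p(\hat\mu_i-\mu_{\delta i})^2 + \frac{1}{M}\sum_{i\in C}E_p(\mu_{\delta i}-\mu_i)^2 + \frac{2}{M}\sum_{i\in C}E_p\left[(\hat\mu_i-\mu_{\delta i})(\mu_{\delta i}-\mu_i)\right]. \tag{17}$$

By (16), the first term on the right side of (17),

$$\frac{1}{M}\sum_{i\in C}E_p(\hat\mu_i-\mu_{\delta i})^2$$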
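$$\begin{aligned}
={}& \frac{1}{M^3}\sum_{i\in C}\sum_{k,l\in C} z_{1ik}z_{1il}\frac{\pi_{kl}-\pi_k\pi_l}{\pi_k\pi_l}
+ \frac{1}{M^3}\sum_{i\in C}\sum_{k,l\in C} z_{2ik}z_{2il}E_p\left[\left(\hat t_k\frac{I_k}{\pi_k}-t_k\right)\left(\hat t_l\frac{I_l}{\pi_l}-t_l\right)\right]\\
&+ \frac{2}{M^3}\sum_{i\in C}\sum_{k,l\in C} z_{1ik}z_{2il}E_p\left[\left(\frac{I_k}{\pi_k}-1\right)\left(\hat t_l\frac{I_l}{\pi_l}-t_l\right)\right]
+ \frac{2}{M^2}\sum_{i,k\in C} z_{1ik}E_p\left[R_i\left(\frac{I_k}{\pi_k}-1\right)\right]\\
&+ \frac{2}{M^2}\sum_{i,k\in C} z_{2ik}E_p\left[R_i\left(\hat t_k\frac{I_k}{\pi_k}-t_k\right)\right]
+ \frac{1}{M}\sum_{i\in C}E_p\left[R_i^2\right].
\end{aligned} \tag{18}$$

Using the proof of Lemma 4 in Breidt and Opsomer (2000), the first term on the right side of (18) converges to zero as $M \to \infty$. Next,

$$\begin{aligned}
&\frac{1}{M^3}\sum_{i\in C}\sum_{k,l\in C} z_{2ik}z_{2il}E_p\left[\left(\hat t_k\frac{I_k}{\pi_k}-t_k\right)\left(\hat t_l\frac{I_l}{\pi_l}-t_l\right)\right]\\
&\quad= \frac{1}{M^3}\sum_{i\in C}\sum_{k\in C} z_{2ik}^2\left(t_k^2\frac{1-\pi_k}{\pi_k}+\frac{V_k}{\pi_k}\right) + \frac{1}{M^3}\sum_{i\in C}\sum_{k\ne l} z_{2ik}z_{2il}\,t_kt_l\frac{\pi_{kl}-\pi_k\pi_l}{\pi_k\pi_l}\\
&\quad\le \frac{c_1}{\lambda Mh}\frac{1}{M}\sum_{i\in C}\sum_{k\in C}\frac{I_{\{x_i-h\le x_k\le x_i+h\}}}{Mh} + \frac{c_2}{\lambda^2}\max_{i,j\in C:\,i\ne j}|\pi_{ij}-\pi_i\pi_j|\,\frac{1}{M}\sum_{i\in C}\left(\sum_{k\in C}\frac{I_{\{x_i-h\le x_k\le x_i+h\}}}{Mh}\right)^2\\
&\quad\to 0 \quad\text{as } M\to\infty
\end{aligned}$$

under A5, A6, A8, independence of the second-stage design, and using Lemma 2 in Breidt and Opsomer (2000). Also, $\frac{1}{M}\sum_{i\in C}E_p[R_i^2]$ converges to zero by Lemma 1, and then the remaining cross-product terms go to zero by the Cauchy-Schwarz inequality. By equation (15), the second term on the right side of (17), $\frac{1}{M}\sum_{i\in C}E_p(\mu_{\delta i}-\mu_i)^2$, converges to zero as $M \to \infty$. The last cross-product term of (17) goes to zero by the Cauchy-Schwarz inequality, and hence the result follows.

Lemma 3. Under A1-A8,

$$\lim_{M\to\infty}\frac{m}{M^2}E_p\sum_{i,j\in C}(\hat\mu_i-\mu_{\delta i})(\hat\mu_j-\mu_{\delta j})\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right) = 0.$$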
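Proof of Lemma 3: By (16),

$$\begin{aligned}
&\frac{m}{M^2}E_p\sum_{i,j\in C}(\hat\mu_i-\mu_{\delta i})(\hat\mu_j-\mu_{\delta j})\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right)\\
&\quad= \frac{m}{M^4}\sum_{i,j,k,l\in C} z_{1ik}z_{1jl}E_p\left[\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right)\left(1-\frac{I_k}{\pi_k}\right)\left(1-\frac{I_l}{\pi_l}\right)\right]\\
&\qquad+ \frac{2m}{M^4}\sum_{i,j,k,l\in C} z_{1ik}z_{2jl}E_p\left[\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right)\left(1-\frac{I_k}{\pi_k}\right)\left(t_l-\hat t_l\frac{I_l}{\pi_l}\right)\right]\\
&\qquad+ \frac{m}{M^4}\sum_{i,j,k,l\in C} z_{2ik}z_{2jl}E_p\left[\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right)\left(t_k-\hat t_k\frac{I_k}{\pi_k}\right)\left(t_l-\hat t_l\frac{I_l}{\pi_l}\right)\right]\\
&\qquad- \frac{2m}{M^3}\sum_{i,j,k\in C} z_{1ik}E_p\left[R_j\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right)\left(1-\frac{I_k}{\pi_k}\right)\right]\\
&\qquad- \frac{2m}{M^3}\sum_{i,j,k\in C} z_{2ik}E_p\left[R_j\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right)\left(t_k-\hat t_k\frac{I_k}{\pi_k}\right)\right]\\
&\qquad+ \frac{m}{M^2}\sum_{i,j\in C}E_p\left[R_iR_j\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right)\right]\\
&\quad= b_1+b_2+b_3+b_4+b_5+b_6.
\end{aligned}$$

Here, the first term, $b_1$, is identical to that of the proof of Lemma 5 in Breidt and Opsomer (2000) and thus converges to zero as $M \to \infty$. Next,

$$\begin{aligned}
b_3 ={}& \frac{m}{M^4}\sum_{i,j,k,l\in C:\,k\ne l} z_{2ik}z_{2jl}E_I\left[\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right)t_k\left(1-\frac{I_k}{\pi_k}\right)t_l\left(1-\frac{I_l}{\pi_l}\right)\right]\\
&+ \frac{m}{M^4}\sum_{i,j,k\in C} z_{2ik}z_{2jk}E_I\left[\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right)\left\{t_k^2\left(1-\frac{I_k}{\pi_k}\right)^2+V_k\frac{I_k}{\pi_k^2}\right\}\right]\\
={}& \frac{m}{M^4}\sum_{i,j,k,l\in C:\,k\ne l}\frac{z_{2ik}z_{2jl}\,t_kt_l}{\pi_i\pi_j\pi_k\pi_l}E_I\left[(I_i-\pi_i)(I_j-\pi_j)(I_k-\pi_k)(I_l-\pi_l)\right]\\
&+ \frac{m}{M^4}\sum_{i,j,k\in C}\frac{z_{2ik}z_{2jk}\,t_k^2}{\pi_i\pi_j\pi_k^2}E_I\left[(I_i-\pi_i)(I_j-\pi_j)(I_k-\pi_k)^2\right]\\
&+ \frac{m}{M^4}\sum_{i,j,k\in C}\frac{z_{2ik}z_{2jk}\,V_k}{\pi_i\pi_j\pi_k^2}E_I\left[(I_i-\pi_i)(I_j-\pi_j)(I_k-\pi_k)\right]\\
&+ \frac{m}{M^4}\sum_{i,j,k\in C}\frac{z_{2ik}z_{2jk}\,V_k}{\pi_i\pi_j\pi_k}E_I\left[(I_i-\pi_i)(I_j-\pi_j)\right].
\end{aligned} \tag{19}$$

Each of the terms on the right side of (19) converges to zero as $M \to \infty$, following the same bounding arguments as in Lemma 1. We omit the details. The $b_6$ term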
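converges to zero by Lemma 1 and A6, and then the remaining cross-product terms go to zero using the Cauchy-Schwarz inequality.

Proof of Theorem 1: By Markov's inequality, it suffices to show that

$$\lim_{M\to\infty}E_p\left|\frac{\tilde t_y-t_y}{M}\right| = 0.$$

We write

$$\frac{\tilde t_y-t_y}{M} = \frac{1}{M}\sum_{i\in C}(t_i-\mu_i)\left(\frac{I_i}{\pi_i}-1\right) + \frac{1}{M}\sum_{i\in C}(\hat t_i-t_i)\frac{I_i}{\pi_i} + \frac{1}{M}\sum_{i\in C}(\hat\mu_i-\mu_i)\left(1-\frac{I_i}{\pi_i}\right).$$

Then

$$\begin{aligned}
E_p\left|\frac{\tilde t_y-t_y}{M}\right| \le{}& E_p\left|\frac{1}{M}\sum_{i\in C}(t_i-\mu_i)\left(\frac{I_i}{\pi_i}-1\right)\right| + \left\{E_p\left(\frac{1}{M}\sum_{i\in C}(\hat t_i-t_i)\frac{I_i}{\pi_i}\right)^2\right\}^{1/2}\\
&+ \left\{\frac{1}{M}\sum_{i\in C}E_p\left[(\hat\mu_i-\mu_i)^2\right]\,\frac{1}{M}\sum_{i\in C}E_p\left[\left(1-\frac{I_i}{\pi_i}\right)^2\right]\right\}^{1/2}.
\end{aligned} \tag{20}$$

The first term on the right of (20) converges to zero as $M \to \infty$ under A1-A6 and the fact that $\limsup_{M\to\infty}\frac{1}{M}\sum_{i\in C}(t_i-\mu_i)^2 < \infty$, following the argument of Theorem 1 in Robinson and Särndal (1983). Using A6, A8, and the independence assumption of the second-stage design,

$$\begin{aligned}
E_p\left(\frac{1}{M}\sum_{i\in C}(\hat t_i-t_i)\frac{I_i}{\pi_i}\right)^2 &= V_I\left(\frac{1}{M}\sum_{i\in C}E_{II}[\hat t_i-t_i]\frac{I_i}{\pi_i}\right) + E_I\left[\frac{1}{M^2}\sum_{i\in C}V_{II}(\hat t_i)\frac{I_i}{\pi_i^2}\right]\\
&= \frac{1}{M^2}\sum_{i\in C}\frac{V_i}{\pi_i} \le \frac{1}{M\lambda}\,\frac{1}{M}\sum_{i\in C}V_i \to 0 \quad\text{as } M\to\infty.
\end{aligned}$$

Thus, the second term on the right of (20) converges to zero as $M \to \infty$. Under A6,

$$\frac{1}{M}\sum_{i\in C}E_p\left[\left(1-\frac{I_i}{\pi_i}\right)^2\right] = \frac{1}{M}\sum_{i\in C}\frac{\pi_i(1-\pi_i)}{\pi_i^2} \le \frac{1}{\lambda}.$$

Combining this with Lemma 2, the last term on the right of (20) converges to zero as $M \to \infty$, and the theorem follows.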
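The three-term decomposition of the estimation error used in the proof above is a purely algebraic identity, so it can be checked numerically. The sketch below assumes the estimator has the usual model-assisted form, the sum of fitted values over all clusters plus the inverse-probability-weighted sum of sampled residuals; the function name `error_decomposition` and the simulated inputs are hypothetical and only serve the check.

```python
import numpy as np

def error_decomposition(t, t_hat, mu, mu_hat, pi, I):
    """Return (tilde_t_y - t_y) / M and the three terms it splits into."""
    M = len(t)
    # Assumed estimator form: fitted values for all clusters plus weighted sampled residuals.
    t_tilde = mu_hat.sum() + np.sum((t_hat - mu_hat) * I / pi)
    total_error = (t_tilde - t.sum()) / M
    term_stage_one = np.sum((t - mu) * (I / pi - 1.0)) / M      # stage-one sampling of residuals
    term_stage_two = np.sum((t_hat - t) * I / pi) / M           # stage-two estimation error
    term_fit = np.sum((mu_hat - mu) * (1.0 - I / pi)) / M       # sample fit vs. population fit
    return total_error, (term_stage_one, term_stage_two, term_fit)

# Simulated inputs (illustrative only; independent cluster selection is an assumption).
rng = np.random.default_rng(0)
M_demo = 50
pi_demo = rng.uniform(0.2, 0.8, M_demo)
I_demo = (rng.uniform(size=M_demo) < pi_demo).astype(float)
t_demo = rng.normal(10.0, 2.0, M_demo)
t_hat_demo = t_demo + rng.normal(0.0, 1.0, M_demo)
mu_demo = rng.normal(10.0, 1.0, M_demo)
mu_hat_demo = mu_demo + rng.normal(0.0, 0.5, M_demo)

total_demo, terms_demo = error_decomposition(
    t_demo, t_hat_demo, mu_demo, mu_hat_demo, pi_demo, I_demo)
```

The identity holds for any input arrays; in practice only sampled clusters contribute to the weighted sums because the indicator is zero elsewhere.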
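Proof of Theorem 2: Let

$$m^{1/2}\left(\frac{\tilde t_y-t_y}{M}\right) = a_M + b_M + c_M,$$

where

$$a_M = m^{1/2}\frac{1}{M}\sum_{i\in C}(t_i-\mu_{\delta i})\left(\frac{I_i}{\pi_i}-1\right),\quad b_M = m^{1/2}\frac{1}{M}\sum_{i\in C}(\hat\mu_i-\mu_{\delta i})\left(1-\frac{I_i}{\pi_i}\right),\quad c_M = m^{1/2}\frac{1}{M}\sum_{i\in C}(\hat t_i-t_i)\frac{I_i}{\pi_i}.$$

Then

$$mE_p\left(\frac{\tilde t_y-t_y}{M}\right)^2 = E_p[a_M^2] + E_p[b_M^2] + E_p[c_M^2] + 2E_p[a_Mb_M] + 2E_p[a_Mc_M] + 2E_p[b_Mc_M].$$

Using equation (15),

$$E_p[a_M^2] = \frac{m}{M^2}\sum_{i,j\in C}(t_i-\mu_i)(t_j-\mu_j)\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j} + o(1),$$

and by Lemma 3,

$$E_p[b_M^2] = \frac{m}{M^2}E_p\sum_{i,j\in C}(\hat\mu_i-\mu_{\delta i})(\hat\mu_j-\mu_{\delta j})\left(1-\frac{I_i}{\pi_i}\right)\left(1-\frac{I_j}{\pi_j}\right) \to 0 \quad\text{as } M\to\infty.$$

Next,

$$E_p[c_M^2] = mE_p\left(\frac{1}{M}\sum_{i\in C}(\hat t_i-t_i)\frac{I_i}{\pi_i}\right)^2 = \frac{m}{M^2}\sum_{i\in C}\frac{V_i}{\pi_i} \le \frac{m}{M}\,\frac{1}{\lambda}\,\frac{1}{M}\sum_{i\in C}V_i,$$

which remains bounded by assumption, and

$$E_p[a_Mc_M] = E_I\left[a_ME_{II}[c_M]\right] = 0$$

because $E_{II}[\hat t_i] = t_i$ for all $i \in C$. The remaining cross-product terms converge to zero by the Cauchy-Schwarz inequality, and hence the result is proved.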
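The two terms that survive in the proof above (the stage-one term from $a_M$ and the stage-two term from $c_M$) are simple to evaluate once the inclusion probabilities are known. The following is a sketch under stated assumptions: `leading_mse_terms` and `poisson_joint` are hypothetical helper names, and the independent (Poisson) selection of clusters in the example is chosen only because its joint inclusion probabilities have a closed form.

```python
import numpy as np

def leading_mse_terms(t, mu, V, pi, pi_joint, m):
    """Leading terms of m * E_p[((tilde_t_y - t_y) / M)^2].

    pi_joint is the M x M matrix of joint inclusion probabilities with
    pi_joint[i, i] = pi[i]; V holds the stage-two variances V_i.
    """
    M = len(t)
    r = (t - mu) / pi
    delta = pi_joint - np.outer(pi, pi)          # pi_ij - pi_i * pi_j
    stage_one = m * (r @ delta @ r) / M**2       # from E_p[a_M^2]
    stage_two = m * np.sum(V / pi) / M**2        # from E_p[c_M^2]
    return stage_one, stage_two

def poisson_joint(pi):
    """Joint inclusion probabilities when clusters are selected independently."""
    pj = np.outer(pi, pi)
    np.fill_diagonal(pj, pi)
    return pj

# Small illustrative population (all values made up for the sketch).
pi_ex = np.array([0.2, 0.5, 0.4, 0.8])
t_ex = np.array([3.0, 7.0, 5.0, 9.0])
mu_ex = np.array([2.5, 6.0, 5.5, 8.0])
V_ex = np.array([0.4, 1.0, 0.6, 2.0])
m_ex = 2
s1_ex, s2_ex = leading_mse_terms(t_ex, mu_ex, V_ex, pi_ex, poisson_joint(pi_ex), m_ex)
```

Under independent selection the off-diagonal entries of the difference matrix vanish, so the stage-one term collapses to a single sum over clusters, which gives a convenient sanity check.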
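Proof of Theorem 3: We write

$$\begin{aligned}
&mE_p\left|\hat V(M^{-1}\tilde t_y) - \mathrm{AMSE}(M^{-1}\tilde t_y)\right|\\
&\quad\le mE_p\left|\frac{1}{M^2}\sum_{i,j\in C}(t_i-\mu_i)(t_j-\mu_j)\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\,\frac{I_iI_j-\pi_{ij}}{\pi_{ij}}\right|\\
&\qquad+ mE_p\left|\frac{1}{M^2}\sum_{i,j\in C}\left\{2(\hat t_i-\mu_i)(\mu_j-\hat\mu_j)+(\mu_i-\hat\mu_i)(\mu_j-\hat\mu_j)\right\}\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\,\frac{I_iI_j}{\pi_{ij}}\right|\\
&\qquad+ mE_p\left|\frac{1}{M^2}\sum_{i,j\in C}2(t_i-\mu_i)(\hat t_j-t_j)\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\,\frac{I_iI_j}{\pi_{ij}}\right|\\
&\qquad+ mE_p\left|\frac{1}{M^2}\sum_{i,j\in C}(\hat t_i-t_i)(\hat t_j-t_j)\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\,\frac{I_iI_j}{\pi_{ij}} - \frac{1}{M^2}\sum_{i\in C}\frac{1-\pi_i}{\pi_i}V_i\right|\\
&\qquad+ mE_p\left|\frac{1}{M^2}\sum_{i\in C}\hat V_i\frac{I_i}{\pi_i} - \frac{1}{M^2}\sum_{i\in C}V_i\right|\\
&\quad= A + B + C + D + E.
\end{aligned}$$

Now $A \to 0$ as $M \to \infty$ by the proof of Theorem 3 in Breidt and Opsomer (2000). Next,

$$\begin{aligned}
B \le{}& \frac{2m}{M^2}E_p\sum_{i\in C}\left|(\hat t_i-\mu_i)(\mu_i-\hat\mu_i)\right|\frac{1-\pi_i}{\pi_i}\,\frac{I_i}{\pi_i}
+ \frac{2m}{M^2}E_p\sum_{i,j\in C:\,i\ne j}\left|(\hat t_i-\mu_i)(\mu_j-\hat\mu_j)\right|\frac{|\pi_{ij}-\pi_i\pi_j|}{\pi_i\pi_j}\,\frac{I_iI_j}{\pi_{ij}}\\
&+ \frac{m}{M^2}E_p\sum_{i\in C}(\mu_i-\hat\mu_i)^2\frac{1-\pi_i}{\pi_i}\,\frac{I_i}{\pi_i}
+ \frac{m}{M^2}E_p\sum_{i,j\in C:\,i\ne j}\left|(\mu_i-\hat\mu_i)(\mu_j-\hat\mu_j)\right|\frac{|\pi_{ij}-\pi_i\pi_j|}{\pi_i\pi_j}\,\frac{I_iI_j}{\pi_{ij}}\\
\le{}& \left(\frac{2m}{M\lambda^2} + \frac{2m\max_{i,j\in C:\,i\ne j}|\pi_{ij}-\pi_i\pi_j|}{\lambda^2\lambda^*}\right)\left\{\frac{1}{M}\sum_{i\in C}\left[V_i+(t_i-\mu_i)^2\right]\,\frac{1}{M}\sum_{i\in C}E_p\left[(\mu_i-\hat\mu_i)^2\right]\right\}^{1/2}\\
&+ \left(\frac{m}{M\lambda^2} + \frac{m\max_{i,j\in C:\,i\ne j}|\pi_{ij}-\pi_i\pi_j|}{\lambda^2\lambda^*}\right)\frac{1}{M}\sum_{i\in C}E_p\left[(\mu_i-\hat\mu_i)^2\right]\\
\to{}& 0 \quad\text{as } M\to\infty
\end{aligned}$$

using A5, A6, A8, A9, and Lemma 2.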
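For $C$, consider

$$\begin{aligned}
&m^2E_p\left(\frac{1}{M^2}\sum_{i,j\in C}(t_i-\mu_i)(\hat t_j-t_j)\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\,\frac{I_iI_j}{\pi_{ij}}\right)^2\\
&\quad= m^2E_p\frac{1}{M^4}\sum_{i,k\in C}(t_i-\mu_i)(t_k-\mu_k)(\hat t_i-t_i)(\hat t_k-t_k)\frac{1-\pi_i}{\pi_i}\,\frac{1-\pi_k}{\pi_k}\,\frac{I_iI_k}{\pi_i\pi_k}\\
&\qquad+ 2m^2E_p\frac{1}{M^4}\sum_{i\in C}\sum_{k,l\in C:\,k\ne l}(t_i-\mu_i)(\hat t_i-t_i)(t_k-\mu_k)(\hat t_l-t_l)\frac{1-\pi_i}{\pi_i}\,\frac{\pi_{kl}-\pi_k\pi_l}{\pi_k\pi_l}\,\frac{I_iI_kI_l}{\pi_i\pi_{kl}}\\
&\qquad+ m^2E_p\frac{1}{M^4}\sum_{i,j\in C:\,i\ne j}\sum_{k,l\in C:\,k\ne l}(t_i-\mu_i)(\hat t_j-t_j)(t_k-\mu_k)(\hat t_l-t_l)\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\,\frac{\pi_{kl}-\pi_k\pi_l}{\pi_k\pi_l}\,\frac{I_iI_jI_kI_l}{\pi_{ij}\pi_{kl}}\\
&\quad= C_1 + C_2 + C_3.
\end{aligned}$$

Here,

$$C_1 = \frac{m^2}{M^4}E_I\left[\sum_{i\in C}(t_i-\mu_i)^2V_i\left(\frac{1-\pi_i}{\pi_i}\right)^2\frac{I_i}{\pi_i^2}\right] \le \frac{m^2}{M^2\lambda^4}\,\frac{1}{M}\left\{\frac{1}{M}\sum_{i\in C}(t_i-\mu_i)^4\,\frac{1}{M}\sum_{i\in C}V_i^2\right\}^{1/2} \to 0 \quad\text{as } M\to\infty$$

by A5, A6, A8, and independence of the second-stage design, and

$$\begin{aligned}
C_3 &= \frac{m^2}{M^4}E_I\sum_{i,j,k\in C:\,i\ne j,\,k\ne j}(t_i-\mu_i)(t_k-\mu_k)V_j\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\,\frac{\pi_{kj}-\pi_k\pi_j}{\pi_k\pi_j}\,\frac{I_iI_jI_k}{\pi_{ij}\pi_{kj}}\\
&\le \frac{\left(m\max_{i,j\in C:\,i\ne j}|\pi_{ij}-\pi_i\pi_j|\right)^2}{\lambda^4\lambda^{*2}}\,\frac{1}{M}\left(\frac{1}{M}\sum_{j\in C}V_j\right)\left(\frac{1}{M}\sum_{i\in C}(t_i-\mu_i)^2\right) \to 0 \quad\text{as } M\to\infty
\end{aligned}$$

by A6, A8, A9, and the independence assumption of the second-stage design. Then $C_2$ goes to zero as $M \to \infty$ by the Cauchy-Schwarz inequality, and it follows that $C \to 0$ as $M \to \infty$. Next, for $D$,

$$\begin{aligned}
&m^2E_p\left(\frac{1}{M^2}\sum_{i,j\in C}(\hat t_i-t_i)(\hat t_j-t_j)\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\,\frac{I_iI_j}{\pi_{ij}} - \frac{1}{M^2}\sum_{i\in C}\frac{1-\pi_i}{\pi_i}V_i\right)^2\\
&\quad= \frac{m^2}{M^4}\sum_{i\in C}E_{II}\left[(\hat t_i-t_i)^4\right]\left(\frac{1-\pi_i}{\pi_i}\right)^2\frac{1}{\pi_i}
+ \frac{m^2}{M^4}\sum_{i,k\in C:\,i\ne k}\frac{1-\pi_i}{\pi_i}\,\frac{1-\pi_k}{\pi_k}\,V_iV_k\frac{\pi_{ik}}{\pi_i\pi_k}\\
&\qquad+ \frac{2m^2}{M^4}\sum_{i,j\in C:\,i\ne j}\left(\frac{\pi_{ij}-\pi_i\pi_j}{\pi_i\pi_j}\right)^2\frac{V_iV_j}{\pi_{ij}}
- \frac{m^2}{M^4}\left(\sum_{i\in C}\frac{1-\pi_i}{\pi_i}V_i\right)^2\\
&\quad\le \frac{m^2}{M^3\lambda^3}\,\frac{1}{M}\sum_{i\in C}E_{II}\left[(\hat t_i-t_i)^4\right]
+ \frac{m}{M^2}\,\frac{m\max_{i,j\in C:\,i\ne j}|\pi_{ij}-\pi_i\pi_j|}{\lambda^4}\left(\frac{1}{M}\sum_{i\in C}V_i\right)^2\\
&\qquad+ \frac{2}{M^2}\,\frac{\left(m\max_{i,j\in C:\,i\ne j}|\pi_{ij}-\pi_i\pi_j|\right)^2}{\lambda^4\lambda^*}\left(\frac{1}{M}\sum_{i\in C}V_i\right)^2\\
&\quad\to 0 \quad\text{as } M\to\infty
\end{aligned}$$

by A5, A6, A8, A9, and the independence assumption of the second-stage design. Thus, $D \to 0$ as $M \to \infty$. Finally, we consider $E$:

$$\begin{aligned}
m^2E_p\left(\frac{1}{M^2}\sum_{i\in C}\hat V_i\frac{I_i}{\pi_i} - \frac{1}{M^2}\sum_{i\in C}V_i\right)^2
&= \frac{m^2}{M^4}\sum_{i\in C}\frac{E_{II}[\hat V_i^2]}{\pi_i} + \frac{m^2}{M^4}\sum_{i,j\in C:\,i\ne j}V_iV_j\frac{\pi_{ij}}{\pi_i\pi_j} - \frac{m^2}{M^4}\sum_{i,j\in C}V_iV_j\\
&\le \frac{m^2}{M^3\lambda}\,\frac{1}{M}\sum_{i\in C}E_{II}[\hat V_i^2] + \frac{m}{M^2}\,\frac{m\max_{i,j\in C:\,i\ne j}|\pi_{ij}-\pi_i\pi_j|}{\lambda^2}\left(\frac{1}{M}\sum_{i\in C}V_i\right)^2\\
&\to 0 \quad\text{as } M\to\infty
\end{aligned}$$

under A5, A6, A8, and the independence assumption of the second-stage design, and so $E$ converges to zero as $M \to \infty$. The result is proved.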
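The terms bounded in the proof above suggest a variance estimator with two pieces: a stage-one piece built from sampled residuals and joint inclusion probabilities, and a stage-two piece built from the within-cluster variance estimates. The sketch below computes that form; it is an illustration under assumptions, not the authors' implementation, and the name `variance_estimate_mean`, the boolean `sampled` mask and the population-length input arrays are hypothetical conventions chosen here.

```python
import numpy as np

def variance_estimate_mean(t_hat, mu_hat, V_hat, pi, pi_joint, sampled):
    """Estimated variance of tilde_t_y / M from one stage-one sample.

    Arrays are indexed by cluster over the whole population, but only the
    entries of sampled clusters are used. pi_joint[i, i] must equal pi[i].
    """
    M = len(pi)
    s = np.flatnonzero(sampled)                       # indices of sampled clusters
    e = (t_hat[s] - mu_hat[s]) / pi[s]                # weighted residuals
    pj = pi_joint[np.ix_(s, s)]
    delta = (pj - np.outer(pi[s], pi[s])) / pj        # (pi_ij - pi_i pi_j) / pi_ij
    stage_one = e @ delta @ e
    stage_two = np.sum(V_hat[s] / pi[s])
    return (stage_one + stage_two) / M**2

# Illustrative inputs with independent cluster selection (an assumption for the example).
pi_v = np.array([0.3, 0.6, 0.5, 0.9, 0.4])
pj_v = np.outer(pi_v, pi_v)
np.fill_diagonal(pj_v, pi_v)
sampled_v = np.array([True, False, True, True, False])
t_hat_v = np.array([4.0, 0.0, 6.5, 9.0, 0.0])
mu_hat_v = np.array([3.5, 5.0, 6.0, 8.0, 4.0])
V_hat_v = np.array([0.5, 0.0, 0.8, 1.5, 0.0])
v_est = variance_estimate_mean(t_hat_v, mu_hat_v, V_hat_v, pi_v, pj_v, sampled_v)
```

With independent selection the off-diagonal weights are zero, so the stage-one piece reduces to a sum of squared weighted residuals times one minus the inclusion probability.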
Table 1: Ratio of design MSE of HT, REG, REG3, PS, LPR0, KERN, and CDW estimators to design MSE of LPR1 estimator, based on 1000 replications of two-stage element sampling from a finite population with M = 1000 clusters and $N_i$ (random cluster size) elements within each cluster. Sample size of clusters is m = 100 and sample size of elements within each cluster is $n_i = 0.5N_i + 1$. Nonparametric estimators are computed with bandwidth h and Epanechnikov kernel.

[Table 1 columns: Study Variable, σ, h, HT, REG, REG3, PS, LPR0, KERN, CDW. Rows: linear, quadratic, bump, jump, exponential, cycle, cycle. Numerical entries not available.]
[Figure 1 panels: WEQ, Transformed WEQ, USLE, Transformed USLE. Horizontal axis: sqrt(size measure). Vertical axis: Tons/Acre/Yr (left column) or sqrt(Tons/Acre/Yr) (right column). Curves: REG4 and LPR1 (h = 3).]

Figure 1: Relationship between x = square root of size measure of land with erosion potential and estimated county total ($\hat t_i$) in stage-one sampled counties for wind erosion (WEQ) and water erosion (USLE), on both original (left column) and square root (right column) vertical scales. Dashed curve is weighted linear regression fit (REG4) and solid curve is local linear regression fit (LPR1 with h = 3).
Estimator        WEQ       USLE
HT               (49.3)    (31.8)
REG2             (50.7)    (26.5)
REG4             (50.1)    (26.5)
REG8             (50.3)    (27.6)
LPR1, h = 1      (47.4)    (24.4)
LPR1, h = 3      (48.8)    (25.2)
LPR1, h = 5      (48.7)    (27.6)

Table 2: Horvitz-Thompson (HT), weighted linear regression (REG2, REG4, REG8), and local linear regression (LPR1 with h = 1, 3, 5) estimates for wind erosion (WEQ) and water erosion (USLE) totals in millions of tons/acre/year. The numbers in parentheses are estimated standard errors.
More informationSampling Theory MODULE VII LECTURE - 23 VARYING PROBABILITY SAMPLING
Samplng heory MODULE VII LECURE - 3 VARYIG PROBABILIY SAMPLIG DR. SHALABH DEPARME OF MAHEMAICS AD SAISICS IDIA ISIUE OF ECHOLOGY KAPUR he smple random samplng scheme provdes a random sample where every