Variance Estimation for the General Regression Estimator


Richard Valliant
Westat, 1650 Research Boulevard, Rockville MD 20850
and Joint Program for Survey Methodology, University of Maryland, University of Michigan
February 2002

ABSTRACT

A variety of estimators of the variance of the general regression (GREG) estimator of a mean have been proposed in the sampling literature, mainly with the goal of estimating the design-based variance. Estimators can be easily constructed that, under certain conditions, are approximately unbiased for both the design-variance and the model-variance. Several dual-purpose estimators are studied here in single-stage sampling. These choices are robust estimators of a model-variance even if the model that motivates the GREG has an incorrect variance parameter. A key feature of the robust estimators is the adjustment of squared residuals by factors analogous to the leverages used in standard regression analysis. We also show that the delete-one jackknife implicitly includes the leverage adjustments and is a good choice from either the design-based or model-based perspective. In a set of simulations, these variance estimators have small bias and produce confidence intervals with near-nominal coverage rates for several sampling methods, sample sizes, and populations in single-stage sampling. We also present simulation results for a skewed population where all variance estimators perform poorly. Samples that do not adequately represent the units with large values lead to estimated means that are too small, variance estimates that are too small, and confidence intervals that cover at far less than the nominal rate. These defects need to be avoided at the design stage by selecting samples that cover the extreme units well. However, in populations with inadequate design information this will not be feasible.

KEY WORDS: Confidence interval coverage; Hat matrix; Jackknife; Leverage; Model unbiased; Skewness

1. Introduction

Robust variance estimation is a key consideration in the prediction approach to finite population sampling. Valliant, Dorfman, and Royall (2000) synthesize much of the model-based literature. In that approach, a working model is formulated that is used to construct a point estimator of a mean or total. Variance estimators are created that are robust in the sense of being approximately model-unbiased and consistent for the model-variance even when the variance specification in the working model is incorrect. In this paper, that approach is extended to the general regression estimator (GREG) to construct variance estimators that are approximately model-unbiased but are also approximately design-unbiased in single-stage sampling. A number of alternatives are compared including the jackknife and some variants of the jackknife.

We will use a particular class of linear models along with Bernoulli or Poisson sampling as motivation for the variance estimators. However, some of these estimators can often be successfully applied in practice to single-stage designs where selections are not independent.

Associated with each unit in the population is a target variable $Y_i$ and a $p$-vector of auxiliary variables $\mathbf{x}_i = (x_{i1}, \ldots, x_{ip})'$, where $i = 1, \ldots, N$. The population vector of totals of the auxiliaries is $\mathbf{T}_x = (T_{x1}, \ldots, T_{xp})'$, where $T_{xk} = \sum_{i=1}^{N} x_{ik}$, $k = 1, \ldots, p$. The general regression estimator, defined below, is motivated by a linear model in which the $Y_i$'s are independent random variables with

$$E_M(Y_i) = \mathbf{x}_i'\boldsymbol{\beta}, \qquad \mathrm{var}_M(Y_i) = v_i. \tag{1.1}$$

In most situations (1.1) is a working model that is likely to be incorrect to some degree.

Assume that a probability sample $s$ is selected and that the selection probability of sample unit $i$ is $P(\delta_i = 1) = \pi_i$, where $\delta_i$ is a 0-1 indicator for whether a unit is in the sample or not. We assume that the sample selection mechanism is ignorable. Roughly speaking, ignorability means that the joint distribution of the $Y$'s and the sample indicators, given the $x$'s, can be factored into the product of the distribution for $Y$ given $x$ and the distribution for the indicators given $x$ (see Sugden and Smith 1984 for a formal definition). In that case, model-based inference can proceed using the model and ignoring the selection mechanism.

The $n$-vector of targets for the sample units is $\mathbf{Y}_s = (Y_1, \ldots, Y_n)'$, and the $n \times p$ matrix of auxiliaries for the sample units is $\mathbf{X}_s = (\mathbf{x}_1, \ldots, \mathbf{x}_n)'$. Define the diagonal matrix of selection probabilities as $\boldsymbol{\Pi}_s = \mathrm{diag}(\pi_i)$, $i \in s$, and the diagonal matrix of model-variances as $\mathbf{V}_s = \mathrm{diag}(v_i)$. The GREG estimator of the total, $T = \sum_{i=1}^{N} Y_i$, is then defined as the Horvitz-Thompson estimator or $\pi$-estimator, $\hat{T}_\pi = \sum_s Y_i/\pi_i$, plus an adjustment:

$$\hat{T}_G = \hat{T}_\pi + \hat{\mathbf{B}}'(\mathbf{T}_x - \hat{\mathbf{T}}_{x\pi}) \tag{1.2}$$

where $\hat{\mathbf{B}} = \mathbf{A}_{\pi s}^{-1}\mathbf{X}_s'\mathbf{V}_s^{-1}\boldsymbol{\Pi}_s^{-1}\mathbf{Y}_s$ with $\mathbf{A}_{\pi s} = \mathbf{X}_s'\mathbf{V}_s^{-1}\boldsymbol{\Pi}_s^{-1}\mathbf{X}_s$, and $\hat{\mathbf{T}}_{x\pi} = \sum_s \mathbf{x}_i/\pi_i$. The GREG estimator can also be written as

$$\hat{T}_G = \mathbf{g}_s'\boldsymbol{\Pi}_s^{-1}\mathbf{Y}_s \tag{1.3}$$

with $\mathbf{g}_s = \mathbf{V}_s^{-1}\mathbf{X}_s\mathbf{A}_{\pi s}^{-1}(\mathbf{T}_x - \hat{\mathbf{T}}_{x\pi}) + \mathbf{1}_s$ and $\mathbf{1}_s$ being an $n$-vector of 1's. Expression (1.3) will be useful for subsequent calculations.

A variant of the GREG, referred to as a cosmetic estimator, was introduced by Särndal and Wright (1984) and amplified by Brewer (1995, 1999). A cosmetic estimator also has design-based and model-based interpretations. The variance estimators in this paper could also be adapted to cover cosmetic estimation.

Assuming that $N$ is known, the GREG estimator of the mean is simply $\hat{\bar{Y}}_G = \hat{T}_G/N$. We will concentrate on the analysis of $\hat{\bar{Y}}_G$. (In some situations, particularly ones where multi-stage sampling is used, the population size is unknown and an estimate, $\hat{N}$, must be used in the denominator of $\hat{\bar{Y}}_G$. The following analysis for the mean does not apply in that case.)

Either quantitative or qualitative auxiliaries (or both) can be used in the GREG. If a qualitative variable like gender (male or female) is used, then two or more columns in $\mathbf{X}_s$ will be linearly dependent, in which case a generalized inverse, denoted by $\mathbf{A}_{\pi s}^{-}$, will be used in (1.2) and (1.3). Note that, although $\mathbf{A}_{\pi s}^{-}$ is not unique, the GREG estimator $\hat{\bar{Y}}_G$ is invariant to the choice of generalized inverse. The proof is similar to that of the corresponding theorem in Valliant, Dorfman, and Royall (2000).

The GREG estimator is model-unbiased under (1.1) and is approximately design-unbiased in large probability samples. Note that the model-unbiasedness requires only that $E_M(Y_i) = \mathbf{x}_i'\boldsymbol{\beta}$; if the variance parameters in (1.1) are misspecified, the GREG will still be model-unbiased. On the other hand, if $E_M(Y_i)$ is incorrectly specified, the GREG is model-biased and the model mean squared error may contain an important bias-squared term.

The estimation error of the GREG $\hat{\bar{Y}}_G$ is defined as

$$\hat{\bar{Y}}_G - \bar{Y} = N^{-1}(\mathbf{a}_s'\mathbf{Y}_s - \mathbf{1}_r'\mathbf{Y}_r)$$

where $\bar{Y} = T/N$, $\mathbf{a}_s = \boldsymbol{\Pi}_s^{-1}\mathbf{g}_s - \mathbf{1}_s$, $\mathbf{Y}_r$ is the $(N-n)$-vector of target variables for the nonsample units, and $\mathbf{1}_r$ is a vector of $N-n$ 1's. Next, suppose that the true model for $Y_i$ is

$$E_M(Y_i) = \mathbf{x}_i'\boldsymbol{\beta}, \qquad \mathrm{var}_M(Y_i) = \psi_i, \tag{1.4}$$

i.e., the variance specification is different from (1.1) but $E_M(Y_i)$ is the same. Using the estimation error, the error-variance of $\hat{\bar{Y}}_G$ is then

$$\mathrm{var}_M(\hat{\bar{Y}}_G - \bar{Y}) = N^{-2}(\mathbf{a}_s'\boldsymbol{\Psi}_s\mathbf{a}_s + \mathbf{1}_r'\boldsymbol{\Psi}_r\mathbf{1}_r)$$

where the $n \times n$ covariance matrix for $\mathbf{Y}_s$ is $\boldsymbol{\Psi}_s = \mathrm{diag}(\psi_i)$, $i \in s$, and $\boldsymbol{\Psi}_r$ is the $(N-n) \times (N-n)$ covariance matrix for $\mathbf{Y}_r$. When the sample and population sizes are both large and the sampling fraction, $f = n/N$, is negligible, the error-variance is approximately

$$\mathrm{var}_M(\hat{\bar{Y}}_G - \bar{Y}) \approx N^{-2}\sum_s a_i^2\psi_i. \tag{1.5}$$

Note that this variance depends on the true variance parameters, $\psi_i$, and on the working model variance parameters, $v_i$, because $v_i$ is part of $a_i$. Since $a_i$ is approximately the same as $g_i/\pi_i$ when selection probabilities are small, the error variance in that case is also approximately

$$\mathrm{var}_M(\hat{\bar{Y}}_G - \bar{Y}) \approx N^{-2}\mathbf{g}_s'\boldsymbol{\Pi}_s^{-1}\boldsymbol{\Psi}_s\boldsymbol{\Pi}_s^{-1}\mathbf{g}_s. \tag{1.6}$$

For model-based variance estimation, we will take either of the asymptotic forms in (1.5) or (1.6) as the target. However, when the sampling fraction is substantial, the term $N^{-2}\mathbf{1}_r'\boldsymbol{\Psi}_r\mathbf{1}_r$ can be an important part of the error-variance and (1.5) or (1.6) may be poor approximations.

We will consider the design variance under two single-stage plans: Bernoulli and Poisson. In Poisson sampling, the indicators $\delta_i$ for whether a unit is in the sample or not are independent with $P(\delta_i = 1) = 1 - P(\delta_i = 0) = \pi_i$ (see Särndal, Swensson, and Wretman 1992, sec.
3.5, for a more detailed description). Bernoulli sampling is a special case of Poisson sampling in which each unit has the same inclusion probability. Under these two plans, the approximate design-variance of $\hat{\bar{Y}}_G$ is

$$\mathrm{var}_\pi(\hat{\bar{Y}}_G) \approx N^{-2}\sum_{i=1}^{N}\frac{1-\pi_i}{\pi_i}E_i^2 \tag{1.7}$$

where $E_i = Y_i - \mathbf{x}_i'\mathbf{B}$ and $\mathbf{B} = (\mathbf{X}'\mathbf{V}^{-1}\mathbf{X})^{-1}\mathbf{X}'\mathbf{V}^{-1}\mathbf{Y}$ is the regression parameter estimator evaluated for the full finite population. Särndal (1996) recommends using the GREG in conjunction with sampling plans for which (1.7) is valid on the grounds that the variance (1.7) is simple and that the use of regression estimation can often more than compensate for the random sample sizes that are a consequence of such designs.

The Bernoulli and Poisson designs and the linear models (1.1) and (1.4) serve mainly as motivation for the variance estimators presented in sections 2 and 3. As noted by Yung and Rao (1996, p. 24), it is common practice to use variance estimators that are appropriate to a design with independent selections or to a with-replacement design even when a sample has been selected without replacement. Likewise, variance estimators motivated by a linear model are often applied in cases where departures from the model are anticipated. This practical approach underlies the thinking in this paper and is illustrated in the simulation study reported in section 4.

2. Variance Estimators

Our general goal in variance estimation will be to find estimators that are consistent and approximately unbiased under both a model and a design. Kott (1990) also considered this problem. Note that the goal here is not the estimation of a combined (or anticipated) model-design variance, $E_M E_\pi[(\hat{\bar{Y}}_G - \bar{Y})^2] - [E_M E_\pi(\hat{\bar{Y}}_G - \bar{Y})]^2$. Rather we seek estimators that are useful for both $\mathrm{var}_M(\hat{\bar{Y}}_G - \bar{Y})$ and $\mathrm{var}_\pi(\hat{\bar{Y}}_G)$. The arguments given here are largely heuristic ones used to motivate the forms of the variance estimators. Additional, formal conditions such as those found in Royall and Cumberland (1978) or Yung and Rao (2000) are needed for model-based and design-based consistency and approximate unbiasedness.

First, consider estimation of the approximate model-variance given in (1.5). In the following development, we assume that, as $N$ and $n$ become large, (i) $N\max_i(\pi_i) = O(n)$ and (ii) $\mathbf{A}_{\pi s}/N$ converges to a matrix of constants, $\mathbf{A}_o$. A residual associated with sample unit $i$ is $r_i = Y_i - \hat{Y}_i$ where $\hat{Y}_i = \mathbf{x}_i'\hat{\mathbf{B}}$. The vector of predicted values for the sample units can be written as

$$\hat{\mathbf{Y}}_s = \mathbf{H}_s\mathbf{Y}_s \tag{2.1}$$

where $\mathbf{H}_s = \mathbf{X}_s\mathbf{A}_{\pi s}^{-1}\mathbf{X}_s'\mathbf{V}_s^{-1}\boldsymbol{\Pi}_s^{-1}$. The predicted value for an individual unit is $\hat{Y}_i = \sum_{j \in s} h_{ij}Y_j$ where $h_{ij} = \mathbf{x}_i'\mathbf{A}_{\pi s}^{-1}\mathbf{x}_j/(v_j\pi_j)$ is the $(ij)$th element of $\mathbf{H}_s$. The matrix $\mathbf{H}_s$ is the analog to the usual hat matrix (Belsley, Kuh, and Welsch 1980) from standard regression analysis. The diagonal elements of the hat matrix are known as leverages and are a measure of the effect that a unit has on its own predicted value. Notice that the inverses of the selection probabilities are involved in (2.1), although these would have no role in purely model-based analysis. The following lemma, which is a variation of some results in a lemma of Valliant, Dorfman, and Royall (2000), gives some properties of the leverages and the hat matrix.
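
The quantities defined so far (the GREG total, the g-weights, the residuals, and the weighted hat matrix) can be computed directly from the matrix formulas. The following is a minimal NumPy sketch, not the author's code: the function name `greg_components`, the simulated population, and the choice of simple random sampling with a constant working variance are all assumptions made for illustration.

```python
import numpy as np

def greg_components(y, X, pi, v, Tx):
    """GREG total, g-weights, residuals, leverages and hat matrix (hypothetical helper)."""
    w = 1.0 / (v * pi)                        # diagonal of V_s^{-1} Pi_s^{-1}
    A = X.T @ (w[:, None] * X)                # A_{pi s} = X' V^{-1} Pi^{-1} X
    A_inv = np.linalg.pinv(A)                 # generalized inverse covers dependent columns
    B_hat = A_inv @ (X.T @ (w * y))           # weighted slope estimator
    T_pi = np.sum(y / pi)                     # pi-estimator of the total of Y
    Tx_pi = X.T @ (1.0 / pi)                  # pi-estimator of the x totals
    T_greg = T_pi + B_hat @ (Tx - Tx_pi)      # total in the "pi-estimator plus adjustment" form
    g = (X @ (A_inv @ (Tx - Tx_pi))) / v + 1  # g-weights
    H = (X @ A_inv @ X.T) * w[None, :]        # h_ij = x_i' A^{-1} x_j / (v_j pi_j)
    r = y - H @ y                             # residuals r_i = Y_i - Yhat_i
    return T_greg, g, r, np.diag(H), H

# Illustrative data (assumed): srswor of n = 40 from N = 500, intercept + x model.
rng = np.random.default_rng(1)
N, n = 500, 40
x_pop = rng.uniform(1.0, 10.0, size=N)
y_pop = 5.0 + 3.0 * x_pop + rng.normal(scale=2.0, size=N)
X_pop = np.column_stack([np.ones(N), x_pop])
s = rng.choice(N, size=n, replace=False)
pi = np.full(n, n / N)
v = np.ones(n)
Tx = X_pop.sum(axis=0)
T_greg, g, r, h, H = greg_components(y_pop[s], X_pop[s], pi, v, Tx)
```

The two forms of the GREG total agree, and the g-weighted sample totals of the auxiliaries reproduce the population totals, which is a convenient check on any implementation.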

Lemma 1. Assume that (i) and (ii) hold. For $\mathbf{H}_s = \mathbf{X}_s\mathbf{A}_{\pi s}^{-1}\mathbf{X}_s'\mathbf{V}_s^{-1}\boldsymbol{\Pi}_s^{-1}$ the following properties hold for all $i, j \in s$:
(a) $h_{ij} = O(n^{-1})$.
(b) $\mathbf{H}_s$ is idempotent.
(c) $0 \le h_{ii} \le 1$.

Proof: Since $h_{ij} = \mathbf{x}_i'\mathbf{A}_{\pi s}^{-1}\mathbf{x}_j/(v_j\pi_j)$, conditions (i) and (ii) imply that $h_{ij} = O(n^{-1})$. Part (b) follows from direct multiplication, using the definition of $\mathbf{H}_s$. To prove (c) note that $h_{ii} \ge 0$ since it is a quadratic form. Part (b) implies that $h_{ii} = h_{ii}^2 + \sum_{j \ne i} h_{ij}h_{ji}$, which can hold only if $h_{ii} \le 1$.

Next, we write the residual as $r_i = Y_i(1 - h_{ii}) - \sum_{j \in s(i)} h_{ij}Y_j$ where $s(i)$ is the sample excluding unit $i$. Since $E_M(r_i) = 0$, we have $E_M(r_i^2) = \mathrm{var}_M(r_i)$ and

$$E_M(r_i^2) = \psi_i(1 - h_{ii})^2 + \sum_{j \in s(i)} h_{ij}^2\psi_j \tag{2.2}$$

under model (1.4). Using Lemma 1(a), we have $h_{ii} = o(1)$, $h_{ij} = o(1)$, and consequently, $E_M(r_i^2) \approx \psi_i$. Thus, in large samples, $r_i^2$ is an approximately unbiased estimator of the correct model-variance even though the variance specification in model (1.1) was incorrect. As a result, $r_i^2$ is a robust estimator of the model-variance for unit $i$ regardless of the form of $\psi_i$. A simple, robust estimator of the approximate model-variance (1.5) is then

$$v_{R1}(\hat{\bar{Y}}_G) = N^{-2}\sum_s a_i^2 r_i^2 \tag{2.3}$$

which is a type of sandwich estimator (see, e.g., White 1982). (Note that a formal argument that $v_{R1}$ is robust would require conditions such that suitably normalized versions of $E_M(v_{R1})$ and $N^{-2}\sum_s a_i^2\psi_i$ converge to the same quantity.) Another variance estimator, similar to $v_{R1}$ if $a_i \approx g_i/\pi_i$, is

$$v_{R2}(\hat{\bar{Y}}_G) = N^{-2}\sum_s \frac{g_i^2}{\pi_i^2}r_i^2. \tag{2.4}$$

An estimator of the approximate design-variance in (1.7) is

$$v_\pi(\hat{\bar{Y}}_G) = N^{-2}\sum_s \frac{1-\pi_i}{\pi_i^2}r_i^2. \tag{2.5}$$

An alternative suggested by Särndal, Swensson, and Wretman (1989) as having better conditional properties is

$$v_{SSW}(\hat{\bar{Y}}_G) = N^{-2}\sum_s \frac{1-\pi_i}{\pi_i^2}g_i^2 r_i^2. \tag{2.6}$$

Another, similar estimator, used in the SUPERCARP software (Hidiroglou, Fuller, and Hickman 1980) and derived using Taylor series methods, is

$$v_T(\hat{\bar{Y}}_G) = N^{-2}\frac{n}{n-1}\sum_s\left(\frac{g_i r_i}{\pi_i} - \frac{1}{n}\sum_s\frac{g_i r_i}{\pi_i}\right)^2. \tag{2.7}$$

As shown in the Appendix, the second term in parentheses in (2.7) converges in probability to zero under model (1.1). Thus, $v_T \approx v_{R2}$ in large samples.
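
To make the differences among these five estimators concrete, the sketch below evaluates each of them from the same inputs: the g-weights, the residuals, the selection probabilities, and the population size. The function name `variance_estimators` and the two-unit toy input are assumptions chosen only so the arithmetic can be followed by hand; they are not taken from the paper.

```python
import numpy as np

def variance_estimators(g, r, pi, N):
    """Sandwich-type and design-based variance estimators for the GREG mean (illustrative)."""
    n = len(r)
    a = g / pi - 1.0                 # a_i = g_i / pi_i - 1
    z = g * r / pi                   # g-weighted, pi-expanded residuals
    return {
        "v_R1": np.sum(a**2 * r**2) / N**2,
        "v_R2": np.sum(z**2) / N**2,
        "v_pi": np.sum((1.0 - pi) / pi**2 * r**2) / N**2,   # ignores the g-weights
        "v_SSW": np.sum((1.0 - pi) * z**2) / N**2,
        "v_T": n / (n - 1) * np.sum((z - z.mean())**2) / N**2,
    }

# Toy input (assumed): two sample units, pi = 0.5 each, N = 4.
out = variance_estimators(
    g=np.array([1.0, 1.0]),
    r=np.array([1.0, -1.0]),
    pi=np.array([0.5, 0.5]),
    N=4,
)
print(out)
```

With this toy input the only differences come from the use of $a_i$ versus $g_i/\pi_i$, the factor $1-\pi_i$, and the $n/(n-1)$ multiplier, which is why the five values differ even though the residuals are identical.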

When the selection probability of each unit is small, $v_{SSW}$ will be similar to $v_{R1}$, $v_{R2}$, and $v_T$. All of these will be approximately model-unbiased under (1.4) and approximately design-unbiased under Bernoulli and Poisson sampling. On the other hand, $v_\pi$ is approximately design-unbiased but ignores the $g_i$ coefficients and is biased under either model (1.1) or (1.4).

As a simple example, consider Bernoulli sampling with $\pi_i = n/N$ and the working model $E_M(Y_i) = x_i\beta$, $\mathrm{var}_M(Y_i) = \sigma^2 x_i$. Then the GREG is the ratio estimator $\hat{\bar{Y}}_G = \bar{Y}_s\bar{x}/\bar{x}_s$ where $\bar{x}$ is a finite population mean. The approximate model variance under the more general specification, $\mathrm{var}_M(Y_i) = \psi_i$, is $n^{-1}(\bar{x}/\bar{x}_s)^2\bar{\psi}_s$ where $\bar{\psi}_s = \sum_s \psi_i/n$. The approximate design-variance is $(1-f)(nN)^{-1}\sum_{i=1}^{N}(Y_i - x_i\bar{Y}/\bar{x})^2$ where $\bar{Y}$ is a finite population mean. The estimator $v_{R2} = n^{-2}(\bar{x}/\bar{x}_s)^2\sum_s(Y_i - x_i\bar{Y}_s/\bar{x}_s)^2$ is approximately unbiased for the model-variance and, because $\bar{x}/\bar{x}_s \approx 1$ in large Bernoulli samples, $v_{R2}$ is also approximately unbiased for the design-variance as long as $f$ is small. In contrast, $v_\pi = n^{-2}(1-f)\sum_s(Y_i - x_i\bar{Y}_s/\bar{x}_s)^2$ is approximately design-unbiased but is model-unbiased only in balanced samples where $\bar{x}_s = \bar{x}$. Royall and Cumberland (1981) noted similar results for the ratio estimator in simple random sampling without replacement.

3. Alternative Variance Estimators Using Adjusted Squared Residuals

The first alternative variance estimator we consider is the jackknife. The particular version to be studied is defined as

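$$v_J = \frac{n-1}{n}\sum_{i=1}^{n}\left(\hat{\bar{Y}}_{G(i)} - \hat{\bar{Y}}_{G(\cdot)}\right)^2 \tag{3.1}$$

where $\hat{\bar{Y}}_{G(i)}$ has the same form as the full sample estimator after omitting sample unit $i$. If the selection probability has the form $\pi_i = np_i$, then (3.1) can be rewritten. Using the convention that the subscript $(i)$ means that sample unit $i$ has been omitted, we have $\hat{\bar{Y}}_{G(i)} = \hat{T}_{G(i)}/N$, $\hat{\bar{Y}}_{G(\cdot)} = \sum_{i \in s}\hat{\bar{Y}}_{G(i)}/n$, $\hat{T}_{G(i)} = \hat{T}_{\pi(i)} + \hat{\mathbf{B}}_{(i)}'(\mathbf{T}_x - \hat{\mathbf{T}}_{x\pi(i)})$, $\hat{T}_{\pi(i)} = \frac{n}{n-1}\sum_{j \in s(i)} Y_j/\pi_j$, $\hat{\mathbf{T}}_{x\pi(i)} = \frac{n}{n-1}\sum_{j \in s(i)} \mathbf{x}_j/\pi_j$, and $\hat{\mathbf{B}}_{(i)} = \mathbf{A}_{\pi s(i)}^{-1}\mathbf{X}_{s(i)}'\mathbf{V}_{s(i)}^{-1}\boldsymbol{\Pi}_{s(i)}^{-1}\mathbf{Y}_{s(i)}$ with $\mathbf{A}_{\pi s(i)} = \mathbf{X}_{s(i)}'\mathbf{V}_{s(i)}^{-1}\boldsymbol{\Pi}_{s(i)}^{-1}\mathbf{X}_{s(i)}$. Another more conservative, but asymptotically equivalent, version of the jackknife replaces $\hat{\bar{Y}}_{G(\cdot)}$ with the full sample estimator $\hat{\bar{Y}}_G$.

Design-based properties of the jackknife in (3.1) are usually studied in samples selected with replacement (see, e.g., Krewski and Rao 1981, Rao and Wu 1985, Yung and Rao 1996), but applied in practice to without-replacement designs. Note that for the linear estimator $\hat{\bar{Y}}_\pi = N^{-1}\sum_s Y_i/\pi_i$ in probability proportional to size without-replacement sampling, neither the jackknife, $v_J$, nor the approximations to $v_J$ given later in this section, reduce to the usual Horvitz-Thompson or Yates-Grundy variance estimators.

With some effort we can write the jackknife in a form that involves the residuals and the leverages. The rewritten form will make clear the relationship of the jackknife to the variance estimators in section 2. First, note the following equalities that are easily verified:

$$\hat{T}_{\pi(i)} = \frac{n}{n-1}\left(\hat{T}_\pi - \frac{Y_i}{\pi_i}\right), \qquad \hat{\mathbf{T}}_{x\pi(i)} = \frac{n}{n-1}\left(\hat{\mathbf{T}}_{x\pi} - \frac{\mathbf{x}_i}{\pi_i}\right) \tag{3.2}$$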

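$$\mathbf{X}_{s(i)}'\mathbf{V}_{s(i)}^{-1}\boldsymbol{\Pi}_{s(i)}^{-1}\mathbf{Y}_{s(i)} = \mathbf{X}_s'\mathbf{V}_s^{-1}\boldsymbol{\Pi}_s^{-1}\mathbf{Y}_s - \frac{\mathbf{x}_i Y_i}{v_i\pi_i}, \qquad \mathbf{A}_{\pi s(i)} = \mathbf{A}_{\pi s} - \frac{\mathbf{x}_i\mathbf{x}_i'}{v_i\pi_i}. \tag{3.3}$$

Using a standard formula for the inverse of the sum of two matrices, the slope estimator, omitting sample unit $i$, equals

$$\hat{\mathbf{B}}_{(i)} = \hat{\mathbf{B}} - \mathbf{A}_{\pi s}^{-1}\mathbf{x}_i\frac{r_i}{v_i\pi_i(1 - h_{ii})}.$$

Details of this and the succeeding computations are sketched in the Appendix. After a considerable amount of algebra, we have

$$\hat{T}_{G(i)} - \hat{T}_{G(\cdot)} = -\frac{n}{n-1}(D_i - \bar{D}_s) + \frac{n}{n-1}F_i$$

where $D_i = g_i r_i/[\pi_i(1 - h_{ii})]$, $\bar{D}_s = \sum_s D_i/n$, and $F_i$ is defined in the Appendix. The jackknife in (3.1) is then equal to

$$v_J(\hat{\bar{Y}}_G) = N^{-2}\frac{n}{n-1}\left[\sum_s(D_i - \bar{D}_s)^2 + \sum_s F_i^2 - 2\sum_s F_i(D_i - \bar{D}_s)\right]. \tag{3.4}$$

Expression (3.4) is an exact equality and could be used as a computational formula for the jackknife. This would sidestep the need to mechanically delete a unit, compute $\hat{\bar{Y}}_{G(i)}$, and so on, through the entire sample. In large samples the first term in brackets in (3.4) is dominant while the second and third are near zero under some reasonable conditions. Thus, in large samples the jackknife is approximated by $v_J(\hat{\bar{Y}}_G) \approx N^{-2}\sum_s(D_i - \bar{D}_s)^2$, or, equivalently,

$$v_J(\hat{\bar{Y}}_G) \approx N^{-2}\left[\sum_s\left(\frac{g_i r_i}{\pi_i(1 - h_{ii})}\right)^2 - \frac{1}{n}\left(\sum_s\frac{g_i r_i}{\pi_i(1 - h_{ii})}\right)^2\right]. \tag{3.5}$$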

As shown in the Appendix, the second term in (3.5) converges in probability to zero under model (1.1). Consequently, a further approximation to the jackknife is

$$v_J(\hat{\bar{Y}}_G) \approx N^{-2}\sum_s\frac{g_i^2 r_i^2}{\pi_i^2(1 - h_{ii})^2}. \tag{3.6}$$

As (3.5) and (3.6) show, the jackknife implicitly incorporates the $g_i$ coefficients needed for estimating the model-variance. The right-hand side of (3.6) is itself an alternative estimator that we will denote by $v_J^*(\hat{\bar{Y}}_G)$.

Yung and Rao (1996) also derived an approximation to the jackknife for the GREG in multistage sampling. For single-stage sampling, their approximation is equal to $v_T$, defined in (2.7), which is the same as (3.5) if the leverages are zero. Duchesne (2000) also presented a formula for the jackknife, which he denoted as $V_{JK}$, that involved sample leverages. The advantage of (3.4) is that it makes clear which parts of the jackknife are negligible in large samples. Duchesne also presented a related estimator that is essentially the same as $v_{R2}$ and is an approximation to the jackknife.

Expressions (3.5) and (3.6) explicitly show how the leverages affect the size of the jackknife. Weighted leverages, $h_{ii}$, that are not near zero will inflate $v_J$. Depending on the configuration of the $x$'s, this could be a substantial effect on some samples. Since $h_{ii}$ approaches zero with increasing sample size, $v_J$, $v_{R2}$, $v_{SSW}$, and $v_T$ have the same asymptotic properties. In particular, the jackknife is approximately unbiased with respect to either the model or the design and is robust to misspecification of the variances in model (1.1). However, the factor $(1 - h_{ii})$ in (3.6) is less than or equal to 1 and will make the jackknife larger than the other variance estimators. This will typically result in confidence intervals based on the jackknife covering at a higher rate than ones using $v_{R2}$, $v_{SSW}$, or $v_T$.

Note, also, that if a without-replacement sample is used, and some first-order or second-order selection probabilities are not small, the choices $v_{R2}$, $v_D$, $v_J$, and $v_J^*$ will be overestimates of either the design-variance or the model-variance. To account for non-negligible selection probabilities, we can make some simple adjustments. An adjusted version of $v_J^*(\hat{\bar{Y}}_G)$, patterned after $v_{SSW}$, is

$$v_{JP}(\hat{\bar{Y}}_G) = N^{-2}\sum_s(1 - \pi_i)\frac{g_i^2 r_i^2}{\pi_i^2(1 - h_{ii})^2}.$$

This expression is similar to $V_{JK3}$ of Duchesne (2000), although $V_{JK3}$ omits the leverages.

Expression (3.6) also suggests another alternative that is closely related to an estimator of the error variance of the best linear unbiased predictor of the mean under model (1.1) (see Valliant, Dorfman, and Royall 2000, ch. 5). This estimator is somewhat less conservative than (3.6), but still adjusts using the leverages:

$$v_D(\hat{\bar{Y}}_G) = N^{-2}\sum_s\frac{g_i^2 r_i^2}{\pi_i^2(1 - h_{ii})}.$$

Because $h_{ii} = o(1)$, $v_D$ is also approximately model and design-unbiased. A variant of this that may perform better when some selection probabilities are large is

$$v_{DP}(\hat{\bar{Y}}_G) = N^{-2}\sum_s(1 - \pi_i)\frac{g_i^2 r_i^2}{\pi_i^2(1 - h_{ii})}.$$
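
The relationship between the delete-one jackknife and its leverage-based approximations can be checked numerically. The sketch below is an illustration under assumed conditions (a simulated population, simple random sampling without replacement, an intercept-plus-slope working model with constant variance); the function names are hypothetical. It computes the jackknife by brute-force deletion, with the retained weights inflated by $n/(n-1)$, and compares it with the four leverage-adjusted estimators of this section.

```python
import numpy as np

def greg_mean(y, X, pi, v, Tx, N):
    """GREG estimator of the mean for one (possibly reduced) sample."""
    w = 1.0 / (v * pi)
    A = X.T @ (w[:, None] * X)
    B = np.linalg.solve(A, X.T @ (w * y))
    return (np.sum(y / pi) + B @ (Tx - X.T @ (1.0 / pi))) / N

def jackknife(y, X, pi, v, Tx, N):
    """Brute-force delete-one jackknife of the GREG mean."""
    n = len(y)
    reps = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        pi_del = pi[keep] * (n - 1) / n          # pi_j = n p_j becomes (n - 1) p_j
        reps[i] = greg_mean(y[keep], X[keep], pi_del, v[keep], Tx, N)
    return (n - 1) / n * np.sum((reps - reps.mean()) ** 2)

def leverage_adjusted(y, X, pi, v, Tx, N):
    """Leverage-adjusted estimators built from g-weights, residuals and leverages."""
    w = 1.0 / (v * pi)
    A_inv = np.linalg.inv(X.T @ (w[:, None] * X))
    g = (X @ (A_inv @ (Tx - X.T @ (1.0 / pi)))) / v + 1.0
    r = y - X @ (A_inv @ (X.T @ (w * y)))
    h = w * np.einsum("ij,jk,ik->i", X, A_inv, X)   # leverages h_ii
    z2 = (g * r / pi) ** 2
    return {
        "v_Jstar": np.sum(z2 / (1 - h) ** 2) / N**2,
        "v_JP": np.sum((1 - pi) * z2 / (1 - h) ** 2) / N**2,
        "v_D": np.sum(z2 / (1 - h)) / N**2,
        "v_DP": np.sum((1 - pi) * z2 / (1 - h)) / N**2,
    }

# Illustrative population and sample (assumed values).
rng = np.random.default_rng(7)
N, n = 2000, 200
x_pop = rng.uniform(1.0, 10.0, size=N)
y_pop = 3.0 + 2.0 * x_pop + rng.normal(size=N) * np.sqrt(x_pop)
X_pop = np.column_stack([np.ones(N), x_pop])
s = rng.choice(N, size=n, replace=False)
pi, v, Tx = np.full(n, n / N), np.ones(n), X_pop.sum(axis=0)

v_J = jackknife(y_pop[s], X_pop[s], pi, v, Tx, N)
est = leverage_adjusted(y_pop[s], X_pop[s], pi, v, Tx, N)
print(v_J, est)
```

Because the leverages lie between 0 and 1, the estimators are ordered as $v_{DP} \le v_D \le v_J^*$ and $v_{JP} \le v_J^*$ in every sample, and with a sample of this size the brute-force jackknife should be close to $v_J^*$.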

4. Simulation Results

To check the performance of the variance estimators, we conducted several simulation studies using three different populations. The first is the Hospitals population listed in Valliant, Dorfman, and Royall (2000, App. B). The second population is the Labor Force population described in Valliant (1993). The third is a modification of the Labor Force population. In all three populations, sampling is done without replacement, as described below. These sampling plans will test the notion that variance estimators motivated, in part, by with-replacement designs can still be useful when applied to without-replacement designs.

The Hospitals population has $N = 393$ and a single auxiliary value $x$, which is the number of inpatient beds in each hospital. The $Y$ variable is the number of patients discharged during a particular time period. The GREG estimator for this population is based on the model $E_M(Y) = \beta_1 x^{1/2} + \beta_2 x$, $\mathrm{var}_M(Y) = \sigma^2 x$. Samples of size 50 and 100 were selected using simple random sampling without replacement (srswor) and probability proportional to size (pps) without replacement with the size being the square root of $x$. For each combination of selection method and sample size, 3000 samples were selected. The estimators $\hat{\bar{Y}}_G$, $v_\pi$, $v_{R1}$, $v_{R2}$, $v_{SSW}$, $v_D$, $v_{DP}$, $v_J$, $v_{JP}$, and $v_J^*$ were calculated for each sample. For comparison we also included the $\pi$-estimator, $\hat{\bar{Y}}_\pi = \hat{T}_\pi/N$. The variance estimator $v_T$ was included but is not reported here since results were little different from $v_{R2}$.

The Labor Force population contains 10,841 persons. The auxiliary variables used were age, sex, and number of hours worked per week. The $Y$ variable was total weekly wages. Age was grouped into four categories: 19 years and under, 20-24, 25-34, and 35 or more. The model for the GREG included an intercept, main effects for age and sex, and the quantitative variable,
hours worked. A constant model-variance was used. Samples of size 50, 100, and 250 were selected. The two selection methods used were srswor and sampling without replacement with probability proportional to hours worked. (This population has some clustering but this was ignored in these simulations.)

The third population was a version of Labor Force designed to inject some outliers or skewness into the weekly wages variable. We denote this new version as LF(mod) for reference. In the original Labor Force population, weekly wages were top-coded at \$999. For each such top-coded wage, a new wage was generated equal to \$1000 plus a lognormal random variable whose distribution had scale and shape parameters of 6.9 and 1. Recoded wages were generated for 4.4% of the population. Prior to recoding, the annualized mean wage was \$19,359, and the maximum was \$51,948; after recoding, the mean was \$23,103 and the maximum was \$608,116. Thus, LF(mod) exhibits more of the skewness in income that would be found in a real population.

The resulting LF(mod) distribution is shown in Figure 1 where weekly wages is plotted against hours worked for subgroups defined by age. In each panel the black points are for males while the open circles are for females. A horizontal reference line is drawn in each panel at \$999. Although there is a considerable amount of over-plotting, the general features are clear. Wage levels and spread go up as age increases, hours worked per week is related, though somewhat weakly, to wages, and wages are most skewed for age groups 25-34 and 35+. Less evident is the fact that wages for males are generally higher than ones for females.

Table 1 shows the empirical percentage relative biases, defined as the average over the samples of $(\hat{T} - T)/T$, for the $\pi$-estimator and general regression estimator for the various populations and sample sizes. Root mean square errors (rmse's), defined as the square root of
the average over the samples of $(\hat{T} - T)^2$, are also shown. In the Hospitals population, both estimators have negligible bias at either sample size. The GREG is considerably more efficient in Hospitals than the $\pi$-estimator because of a strong relationship of $Y$ to $x$. In the two Labor Force populations, both the $\pi$-estimator and the GREG are nearly unbiased while the GREG is somewhat more efficient as measured by the rmse for all sample sizes and selection methods.

Table 2 lists the empirical relative biases (relbiases) of the nine variance estimators, defined as $100(\bar{v} - mse)/mse$, where $\bar{v}$ is the average of a variance estimator over the 3000 samples and $mse$ is the empirical mean square error of the GREG. The rows of the table are sorted by the size of the relbias in LF(mod) for srswor's of size 50, although the ordering would be similar for the other populations, sample sizes, and selection methods. In the Hospitals population, the sampling fraction is substantial, especially when $n = 100$. As might be expected, this results in the estimators that omit any type of finite population correction (fpc), $v_{R2}$, $v_D$, $v_J$, and $v_J^*$, being severe over-estimates in either srswor or pps samples. Because $v_{R1}$ lacks a term to reflect the model-variance of the nonsample sum, it under-estimates the mse badly when the sampling fraction is large.

In the Labor Force and LF(mod) populations, increasing sample size leads to decreasing bias. The estimators $v_\pi$, $v_{R1}$, $v_{R2}$, and $v_{SSW}$ have negative biases that tend to be less severe as the sample size increases. The jackknife $v_J$ and its variants, $v_J^*$ and $v_{JP}$, are over-estimates, especially at $n = 50$. The estimators $v_D$ and $v_{DP}$ are more nearly unbiased at each of the sample sizes than most of the other estimators.

The empirical coverages of 95% confidence intervals across the 3000 samples in each set are shown in Table 3 for the Hospitals population. The three choices of variance estimator that
use the leverage adjustments but not fpc's ($v_D$, $v_J$, and $v_J^*$) are larger and, thus, have higher coverage rates than $v_\pi$, $v_{R2}$, and $v_{SSW}$. The tendency of the jackknife to be larger than other variance estimates for the GREG has also been noted by Stukel, Hidiroglou, and Särndal (1996). This is an advantage for the smaller sample size, $n = 50$. When $n = 100$ and the sampling fraction is large, the estimators with the fpc's ($v_\pi$, $v_{SSW}$, $v_{DP}$, and $v_{JP}$) have closer to the nominal 95% coverage rates while $v_{R2}$, $v_D$, $v_J$, and $v_J^*$ cover in about 97 or 98% of the samples. The estimator $v_{JP}$, which approximates the jackknife but includes an fpc, is a good choice at either sample size or sampling plan.

Tables 4 and 5 show the coverage rates for the Labor Force and LF(mod) populations. For the former, $v_{DP}$, $v_D$, $v_J$, $v_{JP}$, and $v_J^*$ are clearly better at $n = 50$ for both srswor and pps samples. But, for $n = 250$, coverage rates are similar for all estimators. The purely design-based estimator, $v_\pi$, is unsatisfactory at the smaller sample sizes for either sampling plan. As in Hospitals, $v_{JP}$ gives near nominal coverage at each sample size in the Labor Force population.

The most striking results in Tables 4 and 5 are for LF(mod) where all variance estimators give poor coverage. Coverages range from 78.0% for the combination ($v_\pi$, $n = 50$, srswor) to 90.7% for ($v_J$ and $v_J^*$, $n = 250$, pps). Virtually all cases of non-coverage are because $(\hat{\bar{Y}}_G - \bar{Y})/v^{1/2} < -1.96$, where $v$ is any of the variance estimators. The poor coverage rates occur even though the $\pi$-estimator and GREG are unbiased over all samples (see Table 1) and, in the cases of $v_J$, $v_{JP}$, and $v_J^*$, the variance estimators are overestimates (see Table 2).
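
The coverage summaries discussed above are simple functions of the per-sample t-statistics. A minimal sketch of the calculation is given below; the function name `coverage_summary` and the four hypothetical replicate values are assumptions for illustration, not simulation output from the paper.

```python
import numpy as np

def coverage_summary(est, var_est, true_mean, z=1.96):
    """Empirical coverage of est +/- z*sqrt(var_est), with misses split by side."""
    t = (np.asarray(est) - true_mean) / np.sqrt(np.asarray(var_est))
    return {
        "coverage": np.mean(np.abs(t) <= z),
        "miss_low": np.mean(t < -z),    # interval lies entirely below the true mean
        "miss_high": np.mean(t > z),    # interval lies entirely above the true mean
    }

# Four hypothetical replicates with true mean 1.0 (t-statistics 0, 2, -2, 0).
summary = coverage_summary(
    est=[1.0, 2.0, 0.0, 1.0],
    var_est=[0.25, 0.25, 0.25, 0.01],
    true_mean=1.0,
)
print(summary)
```

Reporting the low-side and high-side misses separately is what reveals a pattern like the one described for LF(mod), where nearly all non-coverage comes from t-statistics below $-1.96$.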

Negative estimation errors, $\hat{\bar{Y}}_G - \bar{Y}$, occur in samples that include relatively few persons with large weekly wages. Figure 2 is a plot of t-statistics based on $v_{JP}$, i.e., $(\hat{\bar{Y}}_G - \bar{Y})/v_{JP}^{1/2}$, versus the number of sample persons with weekly wages of \$1000 or more in sets of 1000 samples for (srswor; $n = 50, 100, 250$). The negative estimation errors in samples with few persons with high incomes lead to negative t-statistics, and confidence intervals that miss the population mean on the low side. The problem decreases with increasing sample size, but the convergence to the nominal coverage rates is slow and occurs from the bottom up. Regardless of the variance estimator used, coverage will be less than 95% unless the sample is quite large.

We also examined how well the variance estimators perform, conditional on sample characteristics. We present only results related to bias of the variance estimators to conserve space. For the Hospitals population, we sorted the samples based on $D_x = \mathbf{1}'(\hat{\mathbf{T}}_{x\pi} - \mathbf{T}_x)$, which is the sum of the differences of the $\pi$-estimates of the totals of $x^{1/2}$ and $x$ from their population totals. Twenty groups of 150 samples each were then formed. In each group, we computed the bias of $\hat{\bar{Y}}_G$ along with the rmse, and the square root of the average of each variance estimator. The results are plotted in Figure 3 for srswor with $n = 50$ and 100 and for pps with $n = 50$ and 100. A subset of the variance estimators is plotted. The horizontal axis in each panel gives values of $D_x$. Since $v_J$, $v_J^*$, $v_D$, and $v_{R2}$ are similar through most of the range of $D_x$, only the jackknife $v_J$ is plotted. Also, $v_{DP}$ and $v_{JP}$ are close, and only the latter is plotted. The GREG does have a conditional bias that affects the rmse in off-balance samples. The poor conditional properties of $v_\pi$ are most evident in the simple random samples where the bias of $v_\pi$ as an estimate of the mse runs from negative to positive over the range of $D_x$. Among the other
variance estimates, conditional biases are similar to the unconditional biases in Table 2. Both $v_{JP}$ and $v_{SSW}$ are in theory approximately design and model-unbiased, and both track the rmse well.

Figure 4 is a similar plot for the samples from the Labor Force population. The following sets of estimates are very similar and only the first in each set is included in the plots: ($v_{SSW}$, $v_{R1}$, $v_{R2}$), ($v_D$, $v_{DP}$), and ($v_J$, $v_J^*$, $v_{JP}$). Only the srswor and pps samples of size $n = 50$ and 250 are included. The horizontal axis is again $D_x$, which is the sum of differences between the $\pi$-estimates and the population values of the totals for age and sex groups and the number of hours worked per week. The conditional bias of $v_\pi$ is evident in samples with the smallest values of $D_x$ but the problem diminishes for the larger sample size in both srswor and pps samples. The jackknife $v_J$ is, on average, the largest of the variance estimators throughout the range of $D_x$. The differences among the variance estimates and their biases are less for the larger sample size. The estimators $v_D$, $v_{SSW}$, and $v_J$ all track the rmse reasonably well except when $D_x$ is most negative, where all are somewhat low.

5. Conclusion

A variety of estimators of the variance of the general regression estimator have been proposed in the sampling literature, mainly with the goal of estimating the design-based variance. Estimators can be easily constructed that are approximately unbiased for both the design-variance and, under certain models, the model-variance. Moreover, the dual-purpose estimators studied here are robust estimators of a model-variance even if the model that motivates the GREG has an incorrect variance parameter.

A key feature of the best of these estimators is the adjustment of squared residuals by factors analogous to the leverages used in standard regression analysis. The desirability of using leverage corrections to regression variance estimators in order to combat heteroscedasticity is well-known in econometrics, having been proposed by MacKinnon and White (1985) and recently revisited by Long and Ervin (2000). One of the best choices is an approximation to the jackknife, denoted here by $v_{JP}$, that includes a type of finite population correction.

The robust estimators studied here are quite useful for variables whose distributions are reasonably well behaved. They adjust variance estimators in small and moderate size samples in a way that often results in better confidence interval coverage. However, they are no defense when variables are extremely skewed, and large observations are not well represented in a sample. Whether one refers to this problem as one of skewness or of outliers, the effect is clear. A sample that does not include a sufficient number of units with large values will produce an estimated mean that is too small. A variance estimator that is small often accompanies the small estimated mean. As the simulations in section 4 illustrate, in such samples even the best of the proposed variance estimators will not yield confidence intervals that cover at the nominal rate. The transformation methods of Chen and Chen (1996) might hold some promise, but that approach would have to be tested for the more complex GREG estimators studied here.

The most effective solution to the skewness problem does not appear to be to make better use of the sample data. Rather, the sample itself needs to be designed to include good representation of the large units. In many cases, however, like a survey of households to measure income or capital assets, this may be difficult or impossible if auxiliary information closely related to the target variable is not available. Better use of the sample data employing models for skewed variables may then be useful (see, e.g., Karlberg 2000).

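ACKNOWLEDGMENT

The author is indebted to Alan Dorfman, whose ideas were the impetus for this work, and to the Associate Editor and two referees for their careful reviews.

APPENDIX: Details of Jackknife Calculations

Using (3.2), (3.3), and the standard matrix result given as a lemma in Valliant et al. (2000), we have

$$\mathbf{A}_{\pi s(i)}^{-1} = \mathbf{A}_{\pi s}^{-1} + \frac{\mathbf{A}_{\pi s}^{-1}\mathbf{x}_i\mathbf{x}_i'\mathbf{A}_{\pi s}^{-1}}{v_i\pi_i(1-h_i)}.$$

From this and the definition of $\hat{\mathbf{B}}_{(i)}$, the slope estimator omitting unit $i$, it follows that $\hat{\mathbf{B}}_{(i)} = \hat{\mathbf{B}} - \mathbf{Q}_i$, where

$$\mathbf{Q}_i = \frac{\mathbf{A}_{\pi s}^{-1}\mathbf{x}_i r_i}{v_i\pi_i(1-h_i)}.$$

The GREG estimator, after deleting unit $i$, is

$$\hat{T}_{G(i)} = \frac{n}{n-1}\left(\hat{T}_\pi - \frac{y_i}{\pi_i}\right) + \left(\hat{\mathbf{B}} - \mathbf{Q}_i\right)'\left[\mathbf{T}_x - \frac{n}{n-1}\left(\hat{\mathbf{T}}_x - \frac{\mathbf{x}_i}{\pi_i}\right)\right].$$

After some rearrangement, this can be rewritten as

$$\hat{T}_{G(i)} = \frac{n}{n-1}\hat{T}_G - \frac{n}{n-1}\,\frac{g_i r_i}{\pi_i(1-h_i)} + \frac{n}{n-1}G_i + \frac{1}{n-1}K_i$$

where $G_i = (h_i y_i - \hat{Y}_i)/[\pi_i(1-h_i)]$ with $\hat{Y}_i = \mathbf{x}_i'\hat{\mathbf{B}}$, and $K_i = (\hat{\mathbf{B}} - \mathbf{Q}_i)'(n\mathbf{x}_i/\pi_i - \mathbf{T}_x)$. Writing $D_i = g_i r_i/[\pi_i(1-h_i)]$, it follows that

$$\hat{T}_{G(i)} - \hat{T}_{G(\cdot)} = -\frac{n}{n-1}\left(D_i - \bar{D}_s\right) + F_i$$

where $F_i = \frac{n}{n-1}(G_i - \bar{G}_s) + (n-1)^{-1}(K_i - \bar{K}_s)$, with $\bar{G}_s$ and $\bar{K}_s$ being sample means with the obvious definitions. Substituting in the jackknife formula (3.1) gives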

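$$v_J(\hat{\bar{Y}}_G) = N^{-2}\left[\frac{n}{n-1}\sum_s\left(D_i - \bar{D}_s\right)^2 + \frac{n-1}{n}\sum_s F_i^2 - 2\sum_s F_i\left(D_i - \bar{D}_s\right)\right]. \tag{A.1}$$

Formula (A.1) is exact, but with some further approximations we can get the relative sizes of the terms. Using the values of $G_i$ and $K_i$ above and the fact that $h_i$ and the elements of $\mathbf{Q}_i$ are $o(1)$, we have

$$n^{-1}\left(nG_i + K_i\right) = \frac{h_i y_i - \hat{Y}_i}{\pi_i(1-h_i)} + n^{-1}\left(\hat{\mathbf{B}} - \mathbf{Q}_i\right)'\left(\frac{n\mathbf{x}_i}{\pi_i} - \mathbf{T}_x\right) \cong -\frac{\hat{Y}_i}{\pi_i} + \frac{\hat{\mathbf{B}}'\mathbf{x}_i}{\pi_i} - n^{-1}\hat{\mathbf{B}}'\mathbf{T}_x = -n^{-1}\hat{\mathbf{B}}'\mathbf{T}_x$$

where $\cong$ denotes "asymptotically equivalent to." It follows that $F_i \cong 0$ and that $v_J(\hat{\bar{Y}}_G) \cong N^{-2}\sum_s (D_i - \bar{D}_s)^2$, i.e., (3.5) holds.

Next, we can show that the second term in (3.5) converges in probability to zero. The vector of residuals can be expressed as $\mathbf{r}_s = (\mathbf{I} - \mathbf{H}_s)\mathbf{Y}_s$, and the second term in (3.5) is equal to $N^{-2}n^{-1}\mathbf{g}_s'\boldsymbol{\Pi}_s^{-1}\mathbf{U}\mathbf{r}_s\mathbf{r}_s'\mathbf{U}\boldsymbol{\Pi}_s^{-1}\mathbf{g}_s$, where $\mathbf{U} = \mathrm{diag}[(1-h_i)^{-1}]$, $i \in s$. Thus, the second term in (3.5) is the square of $B = N^{-1}n^{-1/2}\mathbf{g}_s'\boldsymbol{\Pi}_s^{-1}\mathbf{U}\mathbf{r}_s$, which has expectation zero under any model with $E_M(\mathbf{r}_s) = \mathbf{0}$. The model-variance of $B$ is

$$\mathrm{var}_M\left(N^{-1}n^{-1/2}\mathbf{g}_s'\boldsymbol{\Pi}_s^{-1}\mathbf{U}\mathbf{r}_s\right) = N^{-2}n^{-1}\mathbf{g}_s'\boldsymbol{\Pi}_s^{-1}\mathbf{U}(\mathbf{I} - \mathbf{H}_s)\mathbf{V}_s(\mathbf{I} - \mathbf{H}_s)'\mathbf{U}\boldsymbol{\Pi}_s^{-1}\mathbf{g}_s \tag{A.2}$$

which has order of magnitude $n^{-2}$ under the assumptions we have made. Consequently, the second term in (3.5) is the square of a term with mean zero and a model-variance that approaches zero as the sample size increases. The second term in (3.5) then converges to zero by Chebyshev's inequality. This justifies (3.6).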
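The algebra in this appendix can be checked numerically. The sketch below assumes a single auxiliary variable $x$ with known population total $T_x$, a GREG total of the form $\hat{T}_G = \hat{T}_\pi + \hat{B}(T_x - \hat{T}_x)$ with g-weight $g_i = 1 + (T_x - \hat{T}_x)x_i/(A_{\pi s}v_i)$, and a jackknife of the form $(n-1)n^{-1}\sum_s(\cdot)^2$. Equations (3.1) to (3.3) are not reproduced in this section, so these forms, like every function name below, are assumptions made for illustration. The closed form uses the fact that $nG_i + K_i = -(\hat{B} - Q_i)T_x$ exactly, so the last two terms of the rearranged delete-one estimator collapse into one.

```python
def greg_total(y, x, pi, v, t_x, scale=1.0):
    """GREG total with one auxiliary: T_pi + B * (T_x - T_x_pi).

    `scale` rescales the design weights (n/(n-1) after a deletion);
    it cancels in the slope B, so it is applied only to the pi-estimates.
    """
    t_pi = scale * sum(yi / p for yi, p in zip(y, pi))
    t_x_pi = scale * sum(xi / p for xi, p in zip(x, pi))
    a = sum(xi * xi / (vi * p) for xi, vi, p in zip(x, v, pi))
    b = sum(xi * yi / (vi * p) for xi, yi, vi, p in zip(x, y, v, pi)) / a
    return t_pi + b * (t_x - t_x_pi)


def delete_one_direct(y, x, pi, v, t_x):
    """Recompute the GREG total n times, dropping one unit each time."""
    n = len(y)
    out = []
    for i in range(n):
        keep = [j for j in range(n) if j != i]
        out.append(greg_total([y[j] for j in keep], [x[j] for j in keep],
                              [pi[j] for j in keep], [v[j] for j in keep],
                              t_x, scale=n / (n - 1)))
    return out


def delete_one_closed_form(y, x, pi, v, t_x):
    """Delete-one GREG totals from full-sample quantities only."""
    n = len(y)
    c = n / (n - 1)
    t_x_pi = sum(xi / p for xi, p in zip(x, pi))
    a = sum(xi * xi / (vi * p) for xi, vi, p in zip(x, v, pi))
    b = sum(xi * yi / (vi * p) for xi, yi, vi, p in zip(x, y, v, pi)) / a
    t_g = sum(yi / p for yi, p in zip(y, pi)) + b * (t_x - t_x_pi)
    out = []
    for xi, yi, p, vi in zip(x, y, pi, v):
        h = xi * xi / (a * vi * p)                 # leverage h_i
        g = 1.0 + (t_x - t_x_pi) * xi / (a * vi)   # g-weight g_i
        r = yi - xi * b                            # residual r_i
        q = xi * r / (a * vi * p * (1.0 - h))      # Q_i, so B_(i) = b - q
        d = g * r / (p * (1.0 - h))                # D_i
        out.append(c * t_g - c * d - (b - q) * t_x / (n - 1))
    return out


def jackknife_variance(deleted_totals, n_pop):
    """Assumed jackknife form for the estimated mean (total / n_pop)."""
    n = len(deleted_totals)
    mean = sum(deleted_totals) / n
    return (n - 1) / n * sum((t - mean) ** 2 for t in deleted_totals) / n_pop ** 2
```

Under these assumptions the two delete-one routines should agree to rounding error for any sample in which all leverages are below 1, which is a direct check of the slope update $\hat{B}_{(i)} = \hat{B} - Q_i$ and of the rearrangement that leads to (A.1).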

REFERENCES

BELSLEY, D.A., KUH, E., AND WELSCH, R.E. (1980). Regression Diagnostics. New York: John Wiley & Sons.

BREWER, K.R.W. (1995). Combining design-based and model-based inference. Chapter 30 in Business Survey Methods, (Eds. B.G. Cox, D.A. Binder, B.N. Chinnappa, A. Christianson, M.J. Colledge, and P.S. Kott). New York: John Wiley.

BREWER, K.R.W. (1999). Cosmetic Calibration with Unequal Probability Sampling. Survey Methodology, 25.

CHEN, G., AND CHEN, J. (1996). A transformation method for finite population sampling calibrated with empirical likelihood. Survey Methodology, 22.

DUCHESNE, P. (2000). A note on jackknife variance estimation for the general regression estimator. Journal of Official Statistics, 16.

HIDIROGLOU, M.A., FULLER, W.A., AND HICKMAN, R.D. (1980). SUPERCARP. Department of Statistics. Ames, Iowa: Iowa State University.

KARLBERG, F. (2000). Survey estimation for highly skewed populations in the presence of zeroes. Journal of Official Statistics, 16.

KOTT, P.S. (1990). Estimating the conditional variance of a design consistent regression estimator. Journal of Statistical Planning and Inference, 24.

KREWSKI, D., AND RAO, J.N.K. (1981). Inference from stratified samples: properties of the linearization, jackknife, and balanced repeated replication methods. Annals of Statistics, 9.

LONG, J.S., AND ERVIN, L.H. (2000). Using heteroscedasticity consistent standard errors in the linear regression model. The American Statistician, 54.

MACKINNON, J.G., AND WHITE, H. (1985). Some heteroskedastic consistent covariance matrix estimators with improved finite sample properties. Journal of Econometrics, 29.

RAO, J.N.K., AND WU, C.J.F. (1985). Inference from stratified samples: second-order analysis of three methods for nonlinear statistics. Journal of the American Statistical Association, 80.

ROYALL, R.M., AND CUMBERLAND, W.G. (1978). Variance estimation in finite population sampling. Journal of the American Statistical Association, 73.

ROYALL, R.M., AND CUMBERLAND, W.G. (1981). An empirical study of the ratio estimator and estimators of its variance. Journal of the American Statistical Association, 76.

SÄRNDAL, C.-E. (1996). Efficient estimators with simple variance in unequal probability sampling. Journal of the American Statistical Association, 91.

SÄRNDAL, C.-E., SWENSSON, B., AND WRETMAN, J. (1989). The weighted residual technique for estimating the variance of the general regression estimator. Biometrika, 76.

SÄRNDAL, C.-E., SWENSSON, B., AND WRETMAN, J. (1992). Model Assisted Survey Sampling. New York: Springer-Verlag.

SÄRNDAL, C.-E., AND WRIGHT, R. (1984). Cosmetic form of estimators in survey sampling. Scandinavian Journal of Statistics, 11.

STUKEL, D., HIDIROGLOU, M.A., AND SÄRNDAL, C.-E. (1996). Variance estimation for calibration estimators: a comparison of jackknifing versus Taylor linearization. Survey Methodology, 22.

SUGDEN, R.A., AND SMITH, T.M.F. (1984). Ignorable and informative designs in survey sampling inference. Biometrika, 71.

VALLIANT, R. (1993). Poststratification and conditional variance estimation. Journal of the American Statistical Association, 88.

VALLIANT, R., DORFMAN, A.H., AND ROYALL, R.M. (2000). Finite Population Sampling and Inference: A Prediction Approach. New York: John Wiley & Sons.

WHITE, H. (1982). Maximum likelihood estimation of misspecified models. Econometrica, 50, 1-25.

YUNG, W., AND RAO, J.N.K. (1996). Jackknife linearization variance estimators under stratified multi-stage sampling. Survey Methodology, 22.

YUNG, W., AND RAO, J.N.K. (2000). Jackknife variance estimation under imputation for estimators using poststratification information. Journal of the American Statistical Association, 95.

Figure Titles

Figure 1. Scatterplots of Weekly Wages versus Hours Worked per Week in Four Age Groups for the LF(mod) population. Open circles are for females. Black circles are for males. A horizontal line is drawn at $999 per week, the maximum value in the original Labor Force population.

Figure 2. Plot of t-statistics versus the number of sample persons with weekly wages greater than $1000 in the sets of 1000 simple random samples of size n = 50, 100, 250 from the LF(mod) population. Horizontal reference lines are drawn at ± [value missing]. Points are jittered to minimize overplotting.

Figure 3. Plot of conditional biases, rmse's, and means of standard error estimates of the GREG for the samples from the Hospitals population. Horizontal and vertical reference lines are drawn at 0. The lowest curve in each panel is the bias of the GREG. The thick solid line is the conditional root mean square error.

Figure 4. Plot of conditional biases, rmse's, and means of standard error estimates of the GREG for the samples from the Labor Force population. Horizontal and vertical reference lines are drawn at 0. The lowest curve in each panel is the bias of the GREG. The thick solid line is the conditional root mean square error.

Table 1. Relative biases and root mean square errors (rmse's) of the π-estimator and the general regression estimator in different simulation studies of 3000 samples each.
Columns: Hospitals (n = 50, 100); Labor Force (n = 50, 100, 250); LF(mod) (n = 50, 100, 250).
Rows, listed for simple random samples and again for probability proportional to size samples: $\hat{\bar{Y}}_\pi$ Relbias (%), $\hat{\bar{Y}}_\pi$ rmse, $\hat{\bar{Y}}_G$ Relbias (%), $\hat{\bar{Y}}_G$ rmse.
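The two summary measures reported in Table 1 can be computed from a set of simulated estimates as in the sketch below. The exact definitions used in the paper are given in a section not reproduced here, so the conventional forms are assumed: relative bias as the percentage difference between the average estimate and the true mean, and rmse as the square root of the average squared error. The function name `relbias_and_rmse` is hypothetical.

```python
def relbias_and_rmse(estimates, true_value):
    """Relative bias (in percent) and root mean square error of a list of
    simulated estimates of a known population value (assumed definitions)."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    relbias = 100.0 * (mean_est - true_value) / true_value
    rmse = (sum((e - true_value) ** 2 for e in estimates) / n) ** 0.5
    return relbias, rmse
```

In a simulation study such as the one summarized in the table, `estimates` would hold one estimated mean per selected sample and `true_value` the known population mean.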

Table 2. Relative biases of nine variance estimators for the general regression estimator in different simulation studies of 3000 samples each.
Columns: Hospitals (n = 50, 100); Labor Force (n = 50, 100, 250); LF(mod) (n = 50, 100, 250).
Rows, listed for simple random samples and again for probability proportional to size samples: $v_\pi$, $v_{R1}$, $v_{SSW}$, $v_{R2}$, $v_{DP}$, $v_D$, $v_J$, $v_{JP}$, $v_J^*$.

Table 3. 95% confidence interval coverage rates for simulations using the Hospitals population and nine variance estimators. Simple random samples were selected without replacement for samples of size 50 and 100. L is the percent of samples with $(\hat{\bar{Y}}_G - \bar{Y})/v^{1/2} < -1.96$; M is the percent with $|\hat{\bar{Y}}_G - \bar{Y}|/v^{1/2} \le 1.96$; U is the percent with $(\hat{\bar{Y}}_G - \bar{Y})/v^{1/2} > 1.96$.
Columns: n = 50 (L, M, U); n = 100 (L, M, U).
Rows, listed for simple random samples and again for probability proportional to size samples: $v_\pi$, $v_{R1}$, $v_{SSW}$, $v_{R2}$, $v_{DP}$, $v_D$, $v_J$, $v_{JP}$, $v_J^*$.

Table 4. 95% confidence interval coverage rates for simulations using the Labor Force and LF(mod) populations and nine variance estimators. Simple random samples were selected without replacement for samples of size 50, 100, and 250. L is the percent of samples with $(\hat{\bar{Y}}_G - \bar{Y})/v^{1/2} < -1.96$; M is the percent with $|\hat{\bar{Y}}_G - \bar{Y}|/v^{1/2} \le 1.96$; U is the percent with $(\hat{\bar{Y}}_G - \bar{Y})/v^{1/2} > 1.96$.
Columns: n = 50 (L, M, U); n = 100 (L, M, U); n = 250 (L, M, U).
Rows, listed for the Labor Force population and again for LF(mod): $v_\pi$, $v_{R1}$, $v_{SSW}$, $v_{R2}$, $v_{DP}$, $v_D$, $v_J$, $v_{JP}$, $v_J^*$.

Table 5. 95% confidence interval coverage rates for simulations using the Labor Force and LF(mod) populations and nine variance estimators. Probability proportional to size samples were selected without replacement for samples of size 50, 100, and 250. L is the percent of samples with $(\hat{\bar{Y}}_G - \bar{Y})/v^{1/2} < -1.96$; M is the percent with $|\hat{\bar{Y}}_G - \bar{Y}|/v^{1/2} \le 1.96$; U is the percent with $(\hat{\bar{Y}}_G - \bar{Y})/v^{1/2} > 1.96$.
Columns: n = 50 (L, M, U); n = 100 (L, M, U); n = 250 (L, M, U).
Rows, listed for the Labor Force population and again for LF(mod): $v_\pi$, $v_{R1}$, $v_{SSW}$, $v_{R2}$, $v_{DP}$, $v_D$, $v_J$, $v_{JP}$, $v_J^*$.
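The L, M, and U percentages defined in the captions of Tables 3 to 5 can be tabulated from simulation output as in the following sketch. The function name `coverage_rates` and its arguments are hypothetical; each entry of `estimates` is an estimated mean from one sample and the matching entry of `variances` is the variance estimate from the same sample.

```python
def coverage_rates(estimates, variances, true_mean, z=1.96):
    """Percent of samples whose standardized error falls below -z (L),
    within [-z, z] (M), and above z (U)."""
    lower = middle = upper = 0
    for est, var in zip(estimates, variances):
        t = (est - true_mean) / var ** 0.5
        if t < -z:
            lower += 1
        elif t > z:
            upper += 1
        else:
            middle += 1
    total = len(estimates)
    return tuple(100.0 * k / total for k in (lower, middle, upper))
```

With a nominal 95% interval, M near 95 with L and U each near 2.5 indicates good coverage. A large L is the pattern the text associates with skewed populations, where a mean estimate that is too small is paired with a variance estimate that is too small.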

[Figure 1: four scatterplot panels (Age <= 19, Age 20-24, Age 25-34, Age 35+); vertical axis Wages, horizontal axis Hours.]


[Figure 3: panels for srs and pps samples of size n = 50 and n = 100; curves v.p, v.r1, vj.star.p, v.ssw, vj; horizontal axis Tx.hat - Tx.]

[Figure 4: panels for srs and pps samples of size n = 50 and n = 250; curves v.p, vd, v.ssw, vj; horizontal axis Tx.hat - Tx.]


More information

Chapter 14 Simple Linear Regression

Chapter 14 Simple Linear Regression Chapter 4 Smple Lnear Regresson Chapter 4 - Smple Lnear Regresson Manageral decsons often are based on the relatonshp between two or more varables. Regresson analss can be used to develop an equaton showng

More information
