Conditional and unconditional models in modelassisted estimation of finite population totals

Size: px
Start display at page:

Download "Conditional and unconditional models in modelassisted estimation of finite population totals"

Transcription

1 Unversty of Wollongong Research Onlne Faculty of Informatcs - Papers Archve) Faculty of Engneerng and Informaton Scences 2011 Condtonal and uncondtonal models n modelasssted estmaton of fnte populaton totals Davd G. Steel Unversty of Wollongong, dsteel@uow.edu.au Robert Graham Clark Unversty of Wollongong, rclark@uow.edu.au Publcaton Detals Steel, D. G. & Clark, R. Graham. 2011). Condtonal and uncondtonal models n model-asssted estmaton of fnte populaton totals. Pakstan Journal of Statstcs, 27 4), Research Onlne s the open access nsttutonal repostory for the Unversty of Wollongong. For further nformaton contact the UOW Lbrary: research-pubs@uow.edu.au

2 Condtonal and uncondtonal models n model-asssted estmaton of fnte populaton totals Abstract The well known Godambe-Josh lower bound for the antcpated varance of desgn unbased estmators of populaton totals treats the auxlary varables as constants. We extend the result to models where these varables are random and show that the generalzed dfference estmator usng the expected values condtonal on all auxlary values s optmal. Ths has several mplcatons ncludng the fact that collectng multple survey varables does not reduce the lower bound. Dscplnes Physcal Scences and Mathematcs Publcaton Detals Steel, D. G. & Clark, R. Graham. 2011). Condtonal and uncondtonal models n model-asssted estmaton of fnte populaton totals. Pakstan Journal of Statstcs, 27 4), Ths journal artcle s avalable at Research Onlne:

3 Pak. J. Statst Vol. 274), CONDITIONAL AND UNCONDITIONAL MODELS IN MODEL-ASSISTED ESTIMATION OF FINITE POPULATION TOTALS Davd G. Steel and Robert G. Clark Centre for Statstcal and Survey Methodology Unversty of Wollongong Australa ABSTRACT The well known Godambe-Josh lower bound for the antcpated varance of desgn unbased estmators of populaton totals treats the auxlary varables as constants. We extend the result to models where these varables are random and show that the generalzed dfference estmator usng the expected values condtonal on all auxlary values s optmal. Ths has several mplcatons ncludng the fact that collectng multple survey varables does not reduce the lower bound. KEY WORDS Cluster samplng; Generalzed regresson estmator; Optmal estmaton; Probablty samplng AMS Classfcaton 62D05 1. INTRODUCTION 1.1 Inferental Frameworks for Survey Samplng Ths artcle extends a well known result relatng to the best possble estmator of a fnte populaton total accordng to a partcular crtera, the antcpated varance. Consder a populaton U consstng of N unts, where unts may be people, households, busnesses or other enttes. A sample s U s selected from U usng some probablty samplng method, wth = P s denotng the probablty of selecton for unt U. A varable of nterest Y s defned for each unt U. The am s to estmate Y = Y usng data on Y for sampled unts s only. We denote the vector of Y for s by Y s. A vector of auxlary varables X may also be avalable and observed for all unts U. We wrte X U and Y U to denote the collecton of all of these varables for all unts n the populaton. For example, the am may be to measure the total employment for workng age adults n a country. In ths case, the unts would be people and the populaton U would be all workng age adults. The varable Y would be defned to equal 1 f person s employed and 0 otherwse. A survey would be admnstered to determne the employment status of all people n a sample s. There s often external nformaton on age and sex for all people n the populaton, so the elements of X would be ndcator varables ndcatng the age and sex of person. The objectve s to combne nformaton on Y for s wth X U to gve the best estmator of Y. c Pakstan Journal of Statstcs 529

4 530 Condtonal and Uncondtonal Models A number of frameworks have been used n sample survey theory, ncludng the desgnbased, model-based and model-asssted frameworks. In the desgn-based framework, X U and Y U are regarded as constants, and the only source of randomness s the selecton of the sample s. Ths framework avods any explct modellng of X U and Y U. The desgn expectaton of an estmator s defned to be ts expected value over all possble samples, wth X U and Y U treated as constants. Estmators of Y are requred to be exactly or approxmately desgn-unbased, that s ther desgn expectaton expectaton over all possble samples) must be exactly or approxmately equal to Y. For a dscusson of the desgnbased framework, see for example Cochran 1977) and Sarndal, Swensson, and Wretman 1992, ch.1-2). Desgn-varances are smlarly defned to be the varances over the possble values of s, wth X U and Y U treated as constants. In the model-based, or predcton framework, a model s postulated for Y U condtonal on X U. Estmators are usually evaluated n terms of the expectaton condtonal on X U and on the partcular sample s selected, so that the only sources of randomness are the populaton values Y U. Expectatons condtonal on s and X U are called model-expectatons; model-varances are defned smlarly. Predctors of Y are generally requred to be modelunbased, n the sense that the model expectaton of the estmator mnus Y should equal zero, and to have low model varance. For a dscusson of the model-based framework, see for example Brewer 1963), Royall 1970) and Vallant, Dorfman, and Royall 2000). In the model-asssted approach to estmatng a fnte populaton total, both probablty samplng methods and populaton models have a role Sarndal et al., 1992, pp.227, ). Both s and Y U are regarded as random, wth a model of some knd beng assumed for Y U. X U are regarded as constants. Estmators are requred to be at least approxmately) desgn unbased, that s unbased over repeated samplng condtonal on Y U and X U. Subject to ths constrant, t s desrable to reduce the varance over both repeated samplng and repeated realsatons of Y U based on an assumed model. The most commonly used estmator n the model-asssted framework s the generalzed regresson estmator. Most estmators of populaton totals used n practce are specal cases of the generalzed regresson estmator. Sarndal et al. 1992) contans an extensve dscusson of ths estmator and the model-asssted framework. 1.2 Optmal Estmaton The exstence of optmal estmators has been nvestgated for all of these frameworks. In the desgn-based framework, an optmal estmator of Y should deally mnmze the desgn-varance n the class of all desgn-unbased estmators. However, t has been shown that there s no such estmator Godambe, 1955). One proof of ths result Basu, 1971) shows that for every populaton, there s n theory a desgn-unbased estmator of a total whch has zero varance for that partcular populaton, but there s no estmator whch has the lowest varance for all populatons. As noted n Basu, 1971), ths result s not usable n practce because t requres perfect knowledge of the populaton values, but t nevertheless provdes a useful benchmark.) The desgn-based framework s therefore not sutable to gve gudance on best estmators wthout restrctng the range of populatons beng consdered. Ths can be done nformally, for example rato estmators have lower desgn varance than the smpler number-rased estmator for populatons satsfyng a smple condton, for smple random samplng Cochran, 1977, p.157).

5 D.G. Steel and R.G. Clark 531 The non-exstence of an optmal desgn-based estmator of Y was one of the motvatons for the development of the model-asssted framework. In ths framework, estmators are stll requred to be exactly or approxmately desgn-unbased. Unlke the desgn-based framework, however, the am s to mnmze the model expectaton of the desgn varance, under an assumed model for Y U. Ths quantty, whch has been called the antcpated varance Isak & Fuller, 1982), can be mnmzed for desgn-unbased estmators Godambe & Josh, 1965). An estmator wth ths optmalty property would be desgn-unbased regardless of the correctness of the populaton model, and would be optmal n the sense of mnmsng the antcpated varance provded the model s true. Consder the followng model for Y U : E Y = µ vary = σ 2 Y,Y j ndependent j Desgn-unbased estmators Ŷ satsfy the nequalty var Ŷ Y 1 ) σ 2 2) where varance s over both repeated samplng and repeated realsatons of Y U Godambe & Josh, 1965). The estmator whch meets the lower bound s Ŷ µ = µ + Y µ ). 3) The estmator Ŷ µ generally cannot be calculated n practce, because the values of µ and ther sum over the populaton would be unknown n almost all cases. The usual strategy n practce s to assume that µ are known functons of X U and some unknown parameters. These parameters can then be estmated from sample data, and estmates ˆµ can then be used n place of µ n 3). Estmators formed n ths way may have varance whch tends to the lower bound 2) as the sample sze and populaton sze both tend to nfnty, gven a correct model for {µ }. For example, the generalzed regresson estmator has ths property Sarndal, 1980). Estmators wth ths property can be called asymptotcally optmal. Ths paper extends result 2) to the case where the auxlary varables X U are random varables, for certan types of desgns. Secton 2 contans a theorem for ths case. The theorem allows the possblty that multple survey varables are collected and also allows a model wth dependences between dfferent values of the survey varable, subject to a restrcton. A dfference estmator smlar to 3) s shown to be optmal, but wth µ replaced by E Y X U. Hence the apparently arbtrary treatment of X U as constants by Godambe and Josh 1965) s reasonable, as the same estmator s optmal under a more general model. Secton 3 gves four mplcatons of ths result and Secton 4 summarses the conclusons. 2. A THEOREM ON THE AV UNDER AN UNCONDITIONAL MODEL Ths secton contans a theorem gvng a lower bound for the antcpated varance AV) of desgn-unbased estmators of Y, and an optmal estmator of Y. The theorem allows for the possblty that other survey varables, Z, are observed for s. 1)

6 532 Condtonal and Uncondtonal Models The theorem requres that the samplng scheme s non-nformatve. A samplng scheme s sad to be non-nformatve f ps) = Ps s selected Y U,Z U,X U = Ps s selected X U,.e. the samplng process depends only on X U, whch s known at the tme of desgn, and not on the survey varables Y U and Z U whch are collected durng the survey. The populaton X U,Y U,Z U wll be assumed to be generated by a model such that E Y X U = µ = µ X U ) vary X U = σ 2 = σ 2 X U) Note that the expectatons n 4) are model-expectatons. Ths model s extremely general because t allows the means and varances of Y condtonal on X U to be any functons of X U. In practce, {µ } would be assumed to be known functons of X U and some unknown parameters. The unknown parameters would need to be estmated from sample data, so that the lower bound n the Theorem could not be perfectly acheved n practce. Theorem 1 Suppose that a varable Y and a number of other varables Z are observed for s. It s assumed that samplng s non-nformatve. The populaton X U,Y U,Z U s assumed to be generated by model 4). It s further assumed that Y,Z ) s condtonally ndependent of Y j,z j ) gven X U for all j such that there s at least one sample s S wth ps) > 0 where s and j s. Let Ŷ = Ŷ s,x U,D s ) be a desgn unbased estmator of Y so that E Ŷ X U,Y U,Z U = Y where D s denotes the sample data Y s,z s ). Defne Ŷ µ by Then Ŷ µ = } 4) Y µ X U )+ µ X U ) 5) var Ŷ Y var Ŷ µ Y. 6) If Y and Y j are condtonally ndependent gven X U for all j, then var Ŷ µ Y XU = 1 ) σ 2 X U ) 7) var Ŷ µ Y 1 ) σ 2 X U). 8)

7 D.G. Steel and R.G. Clark 533 See Appendx for proof. The proof s smlar to Godambe and Josh 1965) except that the condtonng on X U has been explctly stated. Wth X U a random varable, we have obtaned essentally the same result as Godambe and Josh 1965), provded that Ŷ = Ŷ s,x U,D s ). The condton regardng ndependence of Y,Z ) and Y j,z j ) allows for these to be dependent for some j provded that there s never a dependency between values n the sample and values not n the sample. Ths s s a generalzaton of the condton n Godambe and Josh 1965), where Y was assumed to be ndependent of Y j for every j. The generalzaton allows the result to be appled to one-stage cluster samplng where there are dependences between Y,Z ) and Y j,z j ) for and j n the same cluster, provded that observatons from dfferent clusters are ndependent. The result does not cover multstage samplng, because n ths case there may be correlatons between the values of the sampled and unsampled unts from each prmary samplng unt. However, t would cover most other desgns, ncludng stratfed sngle-stage samplng. There s no requrement n Theorem 1 or n Godambe and Josh 1965) for the estmator Ŷ to be lnear. 3. IMPLICATIONS OF THE THEOREM No Value n Modellng X U The values X U may follow some complex model, for example there may be nterestng dependences between the elements of X, or between the elements of X and X j. The theorem shows that Ŷ µ s the optmal estmator regardless of the dstrbuton of X U. The values of E Y X U are requred to calculate Ŷ µ, and generally modellng of Y condtonal on X U s needed to estmate E Y X U. There s no mprovement n the lower bound on the AV of estmators of Y from consderng the margnal dstrbuton. Hence no beneft from knowng ths margnal dstrbuton would be expected, at least for suffcently large samples. It should be noted that the Theorem states a lower bound whch s generally only attaned asymptotcally. In practce, the model for E Y X U must be estmated from sample data, so that for small samples there could be some beneft from consderng the margnal dstrbuton of X U. Multple Survey Varables It s clear from Theorem 1 that consderng several survey varables Y and Z together does not reduce the lower bound for estmates of Y. The same AV can be acheved by modellng one at a tme each varable collected n the survey. The practcal mplcaton s that there s no large sample reducton n the AV f multvarate models for several survey varables condtonal on X U are used. Agan, t should be noted that the lower bound s generally only attaned asymptotcally, so t s possble that multvarate models could gve some beneft for small samples. Ŷ µ s Stll Optmal under Cluster Samplng In one-stage cluster samplng, a sample of clusters of unts s selected, and all unts n these clusters are selected. In ths case, t would be usual to assume that there may be dependences between the values of the survey varables for unts n the same cluster,

8 534 Condtonal and Uncondtonal Models but no dependences across clusters. In ths stuaton, Theorem 1 shows that Ŷ µ s stll the optmal estmator. Hence modellng the dependences wthn the cluster cannot gve any mprovement n estmators of Y, at least for large samples. One practcal example of onestage cluster samplng mght be all people n selected households, where households are selected by telephone samplng. In mult-stage samplng, a sample of frst stage unts e.g. suburbs) s selected, followed by a sample of second stage unts e.g. households) from selected frst stage unts, followed by a sample of thrd stage unts e.g. from selected second stage unts, and so on. Onestage cluster samplng s a smple specal case of mult-stage samplng, but n practce more complex mult-stage samples are often used. Theorem 1 would not generally apply n ths case, because there could be correlatons between sampled and non-sampled unts. In some mult-stage household surveys, the fnal stage of selecton conssts of all people n selected households. In ths case, the correlatons across households may be neglgble compared to the correlatons between people n the same household, so that Theorem 1 may apply at least approxmately. Mukerjee and Sengupta 1989) derved the optmal lnear model-asssted estmator for correlated data, whch could potentally be appled to mult-stage samples n general. Auxlary Data for All Unts Should Be Used The lower bounds n Theorem 1 are acheved by the estmator Ŷ µ n 5) wth µ Y X U whch condtons on the values of the auxlary varables for all unts n the populaton. Ths may be dfferent from E Y X whch s wdely used n motvatng estmators, for example the generalzed regresson estmator Sarndal, 1980) and the nonlnear estmators of Frth and Bennett Frth & Bennett, 1998) and Lehtonen and Vejanen Lehtonen & Vejanen, 1998). In other words, t s usually assumed that the expectaton of Y condtonal on X U does not depend on X j for j. Ths may not be reasonable for populatons wth natural clusterng or a herarchcal structure Steel & Welsh, 2006). Cluster samplng or stratfed cluster samplng) of all people n selected households may be one case where E Y X U E Y X. In some household surveys, X U conssts of demographc data. A person s survey varable could depend on ther own demographc characterstcs and those of other household members, because the latter could ndcate famly or lvng arrangements. Models of ths knd are called contextual models; see for example Kreft and De Leeuw 1998). Consder a regresson estmator based on E Y X, Ŷ A = Y µ )+ µ where µ Y X. Ths s not the same as Ŷ µ, because Ŷ µ uses µ Y X U. If t can be assumed that there are no correlatons across dfferent households, then Theorem 1 means that the AV of Ŷ A must be greater than or equal to the AV of Ŷ µ. To llustrate the possble loss from usng Ŷ A nstead of the optmal estmator, consder the specal case

9 D.G. Steel and R.G. Clark 535 when: are fxed constants not dependng on X U ; X are ndependently and dentcally dstrbuted for U; {Y : U} are ndependent condtonal on X U ; {X,Y ) : U} are dentcally dstrbuted; the sample sze, n, s non-random so that n n =. It s further assumed that σ 2 = vary X U s equal to a constant, V y xu, for all U, and that vary X s equal to a constant V y x for all U. From Theorem 1, var Ŷ µ Y 1 ) σ 2 X U ) The AV of Ŷ A s gven by = V y xu 1 ). var Ŷ A Y var Ŷ A Y s + var E Ŷ A Y s var Ŷ A Y s + 0 var V y x { 1 ) Y µ ) Y µ ) s s 1 ) 2 + N n } = V y x { 1 ) } 2 + N n = V y x { = V y x Hence the neffcency of Ŷ A s 1 ) 1 ) } ) var Ŷ A Y /var Ŷ µ Y = V y x /V y xu. So the rato of the AVs s equal to the rato of the varance of Y condtonal on X to the varance of Y condtonal on X U, whch depends on the predctve power of X U for Y gven X. 4. SUMMARY The Godambe-Josh lower bound, whch defnes optmal model-asssted estmators of total, treats the auxlary varables as known constants. It s applcable when the values of the survey varable are ndependent for dfferent unts. Theorem 1 states a smlar lower bound whch allows the auxlary varables to be random, allows for multple survey varables, and allows for correlatons between values for dfferent unts provded that sample and non-sample values are always ndependent). Ths leads to four new conclusons: Modellng the margnal dstrbuton of the auxlary varables cannot gve any mprovement n the asymptotc AV of desgn-unbased estmators of Y as the sample and populaton sze tend to nfnty.

10 536 Condtonal and Uncondtonal Models If several survey varables are collected for the sample, t does not reduce the lower bound f all are consdered together. Therefore multvarate modellng of several survey varables has no beneft for the asymptotc AV. The optmal estmator s applcable to one-stage cluster samplng even f there are dependences between values of the survey varables for dfferent unts n the same cluster. However the estmator s not optmal for more complex cases of mult-stage samplng whch are often used n practce. Condtonal models for the varable of nterest for a unt should condton on the auxlary data for all populaton unts, not just the auxlary data for that unt. Ths can make a dfference for contextual models n cluster samplng. Acknowledgements Ths work was jontly supported by the Australan Research Councl and the Australan Bureau of Statstcs. REFERENCES 1. Basu, D. 1971). An essay on the logcal foundatons of survey samplng, part 1. In R. Holt & Wnston Eds.), Foundatons of Statstcal Inference pp ). Toronto. 2. Brewer, K. R. W. 1963). Rato estmaton and fnte populatons: some results deductble from the assumpton of an underlyng stochastc process. Australan Journal of Statstcs, 5, Cochran, W. G. 1977). Samplng Technques 3rd ed.). New York: Wley. 4. Frth, D., & Bennett, K. 1998). Robust models n probablty samplng wth dscusson). Journal of the Royal Statstcal Socety Seres B, 601), Godambe, V. 1955). A unfed theory of samplng from fnte populatons. Journal of the Royal Statstcal Socety Seres B, 17, Godambe, V., & Josh, V. 1965). Admssblty and Bayes estmaton n samplng fnte populatons 1. Annals of Mathematcal Statstcs, 36, Isak, C., & Fuller, W. 1982). Survey desgn under the regresson superpopulaton model. Journal of the Amercan Statstcal Assocaton, 77377), Kreft, I., & De Leeuw, J. 1998). Introducng Multlevel Modelng. London: Sage. 9. Lehtonen, R., & Vejanen, A. 1998). Logstc generalzed regresson estmators. Survey Methodology, 241), Mukerjee, R., & Sengupta, S. 1989). Optmal estmators of fnte populaton total under a general correlated model. Bometrka, 764), Royall, R. M. 1970). On fnte populaton samplng theory under certan lnear regresson models. Bometrka, 572), Sarndal, C. 1980). On nverse weghtng versus best lnear unbased weghtng n probablty samplng. Bometrka, 673),

11 D.G. Steel and R.G. Clark Sarndal, C., Swensson, B., & Wretman, J. 1992). Model Asssted Survey Samplng. New York: Sprnger-Verlag. 14. Steel, D., & Welsh, A. 2006). Jont models for clustered data. Preprnt). Unversty of Wollongong School of Mathematcs and Appled Statstcs. 15. Vallant, R., Dorfman, A. H., & Royall, R. M. 2000). Fnte Populaton Samplng and Inference: A Predcton Approach. New York: Wley. APPENDIX: PROOFS Let Ŷ = Y. Ths s the Horvtz-Thompson or nverse probablty estmator whch s exactly desgn unbased e.g. Sarndal et al., 1992). Let S be the set of all possble samples s. Lemma: Under the assumptons stated n Theorem 1, ps)cov Ŷ Y,Ŷ Ŷ s,xu = 0. Proof of Lemma: ps)cov Ŷ Y,Ŷ Ŷ s,xu = ps)e Ŷ Y E Ŷ Y )Ŷ ) s,xu Ŷ s,xu = ps)e = ps)e = ps)e Ŷ ) Y µ ) Y µ )) Ŷ s,x U 1 ) Y µ ) 1 ) ) Y µ X U )) s Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) s,xu Y µ )) Ŷ Ŷ ) s,x U ) ps)e Y µ X U )) s Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) s,x U 9) The second term of 9) s zero because Z,Y ) and Z j,y j ) are ndependent for all s and j s condtonal on X U, so that Y s condtonally ndependent of Ŷ s,ds,x U ) Ŷ s,d s,x U ) )

12 538 Condtonal and Uncondtonal Models for s. So 9) becomes ps)cov Ŷ Y,Ŷ Ŷ s,xu = ps)e 1 ) Y µ X U )) ) Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) s,xu 10) Now, ps) s the condtonal probablty that sample s s selected gven X U. Hence ps)e 1 ) Y µ X U )) ) Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) s,xu 1 ) Y µ X U )) ) Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) X U and so 10) becomes ps)cov Ŷ Y,Ŷ Ŷ s,x U E 1 ) Y µ X U )) ) s Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) XU 1 ) Y µ X U )) ps) Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) X U,Y U,Z U X U 1 ) Y µ X U )) Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) XU ps) 1 ) Y µ X U )) Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) X U 11) where denotes summaton over all samples s whch contan unt. By assumpton, Ŷ s s desgn unbased so ps) Ŷ s,d s,x U ) Ŷ s,d s,x U ) ) = 0

13 D.G. Steel and R.G. Clark 539 and therefore for any U so that for any U. Substtuton nto 11) gves 0 = ps) Ŷ s,d s,x U ) Ŷ s,d s,x U ) ) s + ps) Ŷ s,d s,x U ) Ŷ s,d s,x U ) ) s ps) Ŷ s,d s,x U ) Ŷ s,d s,x U ) ) s = ps) Ŷ s,d s,x U ) Ŷ s,d s,x U ) ) s ps)cov Ŷ Y,Ŷ Ŷ s,xu = E = E = E s E ps) 1 ) Y µ X U )) s s Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) XU ps) 1 ) Y µ X U )) Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) s,x U X U ps) 1 ) E Y µ X U )) Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) s,x U X U. 12) From the assumptons of the theorem, Y and Y j are condtonally ndependent whenever s and j s for all samples s wth ps) > 0. Hence Y s condtonally uncorrelated wth Ŷ s,ds,x U ) Ŷ s,d s,x U ) ) for all samples s wth ps) > 0. Hence the rght hand sde of 12) s zero. Proof of Theorem 1: var Ŷ Y XU var Ŷ Y Ŷ s,xu XU + var E Y s,xu XU E var Ŷ Y s,x U X U = ps)var Ŷ Y s,xu

14 540 Condtonal and Uncondtonal Models = ps)var Ŷ Ŷ +Ŷ Y s,xu = ps) { var Ŷ Ŷ s,xu + var Ŷ Y s,xu +2cov Ŷ Ŷ,Ŷ Y } s,x U ps)var Ŷ Y s,xu +2 ps)cov Ŷ Ŷ,Ŷ Y s,xu 13) The Lemma states that the second term of 13) s zero. Also, Ŷ s equal to Ŷ µ plus terms dependng only on s and X U and hence constant condtonal on s and X U. Hence var Ŷ Y s,x U = var Ŷµ Y s,x U. Makng both of these substtutons nto 13) gves: Inequalty 14) mples that var Ŷ Y X U ps)var Ŷ µ Y s,x U 14) var Ŷ Y var Ŷ Y Ŷ XU + var E Y XU E var Ŷ Y XU E ps)var Ŷ µ Y s,xu. 15) Notce that Ŷ µ s unbased condtonal on s and X U : E Ŷ µ Y s,x U = Y µ )+ µ µ )+ µ Y s,x U µ µ = 0. Hence var Ŷ µ Y var Ŷ µ Y XU,s + var E Ŷ µ Y XU,s var Ŷ µ Y X U,s + var0 var Ŷ µ Y XU,s ps)var Ŷ µ Y s,xu 16) Substtutng 16) nto 15) gves whch s result 6) of the Theorem. var Ŷ Y var Ŷ µ Y

15 D.G. Steel and R.G. Clark 541 If Y and Y j are condtonally ndependent gven X U for all j, then the condtonal varance of Ŷ µ Y s var Ŷ µ Y XU var Y µ )+ µ Y s,x U X U +var0 X U = = var 1 ) Y Y s,x U X U s σ 2 X U 1 ) 2 σ 2 + s 1 ) 2 σ )σ 2 { 1 ) }σ 2 = 1 ) σ 2 whch s result 7) of the Theorem. The uncondtonal varance of Ŷ µ Y s var Ŷ µ Y var Ŷ µ Y XU + var E Ŷµ Y XU 1 ) σ 2 + var0 1 ) σ 2 whch s result 8).

Estimation: Part 2. Chapter GREG estimation

Estimation: Part 2. Chapter GREG estimation Chapter 9 Estmaton: Part 2 9. GREG estmaton In Chapter 8, we have seen that the regresson estmator s an effcent estmator when there s a lnear relatonshp between y and x. In ths chapter, we generalzed the

More information

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction

The Multiple Classical Linear Regression Model (CLRM): Specification and Assumptions. 1. Introduction ECONOMICS 5* -- NOTE (Summary) ECON 5* -- NOTE The Multple Classcal Lnear Regresson Model (CLRM): Specfcaton and Assumptons. Introducton CLRM stands for the Classcal Lnear Regresson Model. The CLRM s also

More information

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X

3.1 Expectation of Functions of Several Random Variables. )' be a k-dimensional discrete or continuous random vector, with joint PMF p (, E X E X1 E X Statstcs 1: Probablty Theory II 37 3 EPECTATION OF SEVERAL RANDOM VARIABLES As n Probablty Theory I, the nterest n most stuatons les not on the actual dstrbuton of a random vector, but rather on a number

More information

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4)

Econ107 Applied Econometrics Topic 3: Classical Model (Studenmund, Chapter 4) I. Classcal Assumptons Econ7 Appled Econometrcs Topc 3: Classcal Model (Studenmund, Chapter 4) We have defned OLS and studed some algebrac propertes of OLS. In ths topc we wll study statstcal propertes

More information

Efficient nonresponse weighting adjustment using estimated response probability

Efficient nonresponse weighting adjustment using estimated response probability Effcent nonresponse weghtng adjustment usng estmated response probablty Jae Kwang Km Department of Appled Statstcs, Yonse Unversty, Seoul, 120-749, KOREA Key Words: Regresson estmator, Propensty score,

More information

Potential gains from using unit level cost information in a model-assisted framework

Potential gains from using unit level cost information in a model-assisted framework Unversty of Wollongong Research Onlne Centre for Statstcal & Survey Methodology Workng Paper Seres Faculty of Engneerng and Informaton Scences 2013 Potental gans from usng unt level cost nformaton n a

More information

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models

Computation of Higher Order Moments from Two Multinomial Overdispersion Likelihood Models Computaton of Hgher Order Moments from Two Multnomal Overdsperson Lkelhood Models BY J. T. NEWCOMER, N. K. NEERCHAL Department of Mathematcs and Statstcs, Unversty of Maryland, Baltmore County, Baltmore,

More information

A note on regression estimation with unknown population size

A note on regression estimation with unknown population size Statstcs Publcatons Statstcs 6-016 A note on regresson estmaton wth unknown populaton sze Mchael A. Hdroglou Statstcs Canada Jae Kwang Km Iowa State Unversty jkm@astate.edu Chrstan Olver Nambeu Statstcs

More information

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U)

ANSWERS. Problem 1. and the moment generating function (mgf) by. defined for any real t. Use this to show that E( U) var( U) Econ 413 Exam 13 H ANSWERS Settet er nndelt 9 deloppgaver, A,B,C, som alle anbefales å telle lkt for å gøre det ltt lettere å stå. Svar er gtt . Unfortunately, there s a prntng error n the hnt of

More information

Chapter 5 Multilevel Models

Chapter 5 Multilevel Models Chapter 5 Multlevel Models 5.1 Cross-sectonal multlevel models 5.1.1 Two-level models 5.1.2 Multple level models 5.1.3 Multple level modelng n other felds 5.2 Longtudnal multlevel models 5.2.1 Two-level

More information

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1

j) = 1 (note sigma notation) ii. Continuous random variable (e.g. Normal distribution) 1. density function: f ( x) 0 and f ( x) dx = 1 Random varables Measure of central tendences and varablty (means and varances) Jont densty functons and ndependence Measures of assocaton (covarance and correlaton) Interestng result Condtonal dstrbutons

More information

Economics 130. Lecture 4 Simple Linear Regression Continued

Economics 130. Lecture 4 Simple Linear Regression Continued Economcs 130 Lecture 4 Contnued Readngs for Week 4 Text, Chapter and 3. We contnue wth addressng our second ssue + add n how we evaluate these relatonshps: Where do we get data to do ths analyss? How do

More information

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands

1. Inference on Regression Parameters a. Finding Mean, s.d and covariance amongst estimates. 2. Confidence Intervals and Working Hotelling Bands Content. Inference on Regresson Parameters a. Fndng Mean, s.d and covarance amongst estmates.. Confdence Intervals and Workng Hotellng Bands 3. Cochran s Theorem 4. General Lnear Testng 5. Measures of

More information

Composite Hypotheses testing

Composite Hypotheses testing Composte ypotheses testng In many hypothess testng problems there are many possble dstrbutons that can occur under each of the hypotheses. The output of the source s a set of parameters (ponts n a parameter

More information

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling

Multivariate Ratio Estimator of the Population Total under Stratified Random Sampling Open Journal of Statstcs, 0,, 300-304 ttp://dx.do.org/0.436/ojs.0.3036 Publsed Onlne July 0 (ttp://www.scrp.org/journal/ojs) Multvarate Rato Estmator of te Populaton Total under Stratfed Random Samplng

More information

Weighted Estimating Equations with Response Propensities in Terms of Covariates Observed only for Responders

Weighted Estimating Equations with Response Propensities in Terms of Covariates Observed only for Responders Weghted Estmatng Equatons wth Response Propenstes n Terms of Covarates Observed only for Responders Erc V. Slud, U.S. Census Bureau, CSRM Unv. of Maryland, Mathematcs Dept. NISS Mssng Data Workshop, November

More information

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA

4 Analysis of Variance (ANOVA) 5 ANOVA. 5.1 Introduction. 5.2 Fixed Effects ANOVA 4 Analyss of Varance (ANOVA) 5 ANOVA 51 Introducton ANOVA ANOVA s a way to estmate and test the means of multple populatons We wll start wth one-way ANOVA If the populatons ncluded n the study are selected

More information

Comparison of Regression Lines

Comparison of Regression Lines STATGRAPHICS Rev. 9/13/2013 Comparson of Regresson Lnes Summary... 1 Data Input... 3 Analyss Summary... 4 Plot of Ftted Model... 6 Condtonal Sums of Squares... 6 Analyss Optons... 7 Forecasts... 8 Confdence

More information

STAT 3008 Applied Regression Analysis

STAT 3008 Applied Regression Analysis STAT 3008 Appled Regresson Analyss Tutoral : Smple Lnear Regresson LAI Chun He Department of Statstcs, The Chnese Unversty of Hong Kong 1 Model Assumpton To quantfy the relatonshp between two factors,

More information

Lecture 4 Hypothesis Testing

Lecture 4 Hypothesis Testing Lecture 4 Hypothess Testng We may wsh to test pror hypotheses about the coeffcents we estmate. We can use the estmates to test whether the data rejects our hypothess. An example mght be that we wsh to

More information

A Robust Method for Calculating the Correlation Coefficient

A Robust Method for Calculating the Correlation Coefficient A Robust Method for Calculatng the Correlaton Coeffcent E.B. Nven and C. V. Deutsch Relatonshps between prmary and secondary data are frequently quantfed usng the correlaton coeffcent; however, the tradtonal

More information

A general class of estimators for the population mean using multi-phase sampling with the non-respondents

A general class of estimators for the population mean using multi-phase sampling with the non-respondents Hacettepe Journal of Mathematcs Statstcs Volume 43 3 014 511 57 A general class of estmators for the populaton mean usng mult-phase samplng wth the non-respondents Saba Raz Gancarlo Dana Javd Shabbr Abstract

More information

Exponential Type Product Estimator for Finite Population Mean with Information on Auxiliary Attribute

Exponential Type Product Estimator for Finite Population Mean with Information on Auxiliary Attribute Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 193-9466 Vol. 10, Issue 1 (June 015), pp. 106-113 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) Exponental Tpe Product Estmator

More information

USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE

USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE STATISTICA, anno LXXV, n. 4, 015 USE OF DOUBLE SAMPLING SCHEME IN ESTIMATING THE MEAN OF STRATIFIED POPULATION UNDER NON-RESPONSE Manoj K. Chaudhary 1 Department of Statstcs, Banaras Hndu Unversty, Varanas,

More information

Uncertainty as the Overlap of Alternate Conditional Distributions

Uncertainty as the Overlap of Alternate Conditional Distributions Uncertanty as the Overlap of Alternate Condtonal Dstrbutons Olena Babak and Clayton V. Deutsch Centre for Computatonal Geostatstcs Department of Cvl & Envronmental Engneerng Unversty of Alberta An mportant

More information

x = , so that calculated

x = , so that calculated Stat 4, secton Sngle Factor ANOVA notes by Tm Plachowsk n chapter 8 we conducted hypothess tests n whch we compared a sngle sample s mean or proporton to some hypotheszed value Chapter 9 expanded ths to

More information

Chapter 11: Simple Linear Regression and Correlation

Chapter 11: Simple Linear Regression and Correlation Chapter 11: Smple Lnear Regresson and Correlaton 11-1 Emprcal Models 11-2 Smple Lnear Regresson 11-3 Propertes of the Least Squares Estmators 11-4 Hypothess Test n Smple Lnear Regresson 11-4.1 Use of t-tests

More information

Maximizing Overlap of Large Primary Sampling Units in Repeated Sampling: A comparison of Ernst s Method with Ohlsson s Method

Maximizing Overlap of Large Primary Sampling Units in Repeated Sampling: A comparison of Ernst s Method with Ohlsson s Method Maxmzng Overlap of Large Prmary Samplng Unts n Repeated Samplng: A comparson of Ernst s Method wth Ohlsson s Method Red Rottach and Padrac Murphy 1 U.S. Census Bureau 4600 Slver Hll Road, Washngton DC

More information

Lecture 3 Stat102, Spring 2007

Lecture 3 Stat102, Spring 2007 Lecture 3 Stat0, Sprng 007 Chapter 3. 3.: Introducton to regresson analyss Lnear regresson as a descrptve technque The least-squares equatons Chapter 3.3 Samplng dstrbuton of b 0, b. Contnued n net lecture

More information

/ n ) are compared. The logic is: if the two

/ n ) are compared. The logic is: if the two STAT C141, Sprng 2005 Lecture 13 Two sample tests One sample tests: examples of goodness of ft tests, where we are testng whether our data supports predctons. Two sample tests: called as tests of ndependence

More information

Chapter 13: Multiple Regression

Chapter 13: Multiple Regression Chapter 13: Multple Regresson 13.1 Developng the multple-regresson Model The general model can be descrbed as: It smplfes for two ndependent varables: The sample ft parameter b 0, b 1, and b are used to

More information

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur

Module 3 LOSSY IMAGE COMPRESSION SYSTEMS. Version 2 ECE IIT, Kharagpur Module 3 LOSSY IMAGE COMPRESSION SYSTEMS Verson ECE IIT, Kharagpur Lesson 6 Theory of Quantzaton Verson ECE IIT, Kharagpur Instructonal Objectves At the end of ths lesson, the students should be able to:

More information

An (almost) unbiased estimator for the S-Gini index

An (almost) unbiased estimator for the S-Gini index An (almost unbased estmator for the S-Gn ndex Thomas Demuynck February 25, 2009 Abstract Ths note provdes an unbased estmator for the absolute S-Gn and an almost unbased estmator for the relatve S-Gn for

More information

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek

Discussion of Extensions of the Gauss-Markov Theorem to the Case of Stochastic Regression Coefficients Ed Stanek Dscusson of Extensons of the Gauss-arkov Theorem to the Case of Stochastc Regresson Coeffcents Ed Stanek Introducton Pfeffermann (984 dscusses extensons to the Gauss-arkov Theorem n settngs where regresson

More information

REPLICATION VARIANCE ESTIMATION UNDER TWO-PHASE SAMPLING IN THE PRESENCE OF NON-RESPONSE

REPLICATION VARIANCE ESTIMATION UNDER TWO-PHASE SAMPLING IN THE PRESENCE OF NON-RESPONSE STATISTICA, anno LXXIV, n. 3, 2014 REPLICATION VARIANCE ESTIMATION UNDER TWO-PHASE SAMPLING IN THE PRESENCE OF NON-RESPONSE Muqaddas Javed 1 Natonal College of Busness Admnstraton and Economcs, Lahore,

More information

Global Sensitivity. Tuesday 20 th February, 2018

Global Sensitivity. Tuesday 20 th February, 2018 Global Senstvty Tuesday 2 th February, 28 ) Local Senstvty Most senstvty analyses [] are based on local estmates of senstvty, typcally by expandng the response n a Taylor seres about some specfc values

More information

Now we relax this assumption and allow that the error variance depends on the independent variables, i.e., heteroskedasticity

Now we relax this assumption and allow that the error variance depends on the independent variables, i.e., heteroskedasticity ECON 48 / WH Hong Heteroskedastcty. Consequences of Heteroskedastcty for OLS Assumpton MLR. 5: Homoskedastcty var ( u x ) = σ Now we relax ths assumpton and allow that the error varance depends on the

More information

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers

Psychology 282 Lecture #24 Outline Regression Diagnostics: Outliers Psychology 282 Lecture #24 Outlne Regresson Dagnostcs: Outlers In an earler lecture we studed the statstcal assumptons underlyng the regresson model, ncludng the followng ponts: Formal statement of assumptons.

More information

Chapter 9: Statistical Inference and the Relationship between Two Variables

Chapter 9: Statistical Inference and the Relationship between Two Variables Chapter 9: Statstcal Inference and the Relatonshp between Two Varables Key Words The Regresson Model The Sample Regresson Equaton The Pearson Correlaton Coeffcent Learnng Outcomes After studyng ths chapter,

More information

A Comparative Study for Estimation Parameters in Panel Data Model

A Comparative Study for Estimation Parameters in Panel Data Model A Comparatve Study for Estmaton Parameters n Panel Data Model Ahmed H. Youssef and Mohamed R. Abonazel hs paper examnes the panel data models when the regresson coeffcents are fxed random and mxed and

More information

LINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables

LINEAR REGRESSION ANALYSIS. MODULE VIII Lecture Indicator Variables LINEAR REGRESSION ANALYSIS MODULE VIII Lecture - 7 Indcator Varables Dr. Shalabh Department of Maematcs and Statstcs Indan Insttute of Technology Kanpur Indcator varables versus quanttatve explanatory

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Exerments-I MODULE III LECTURE - 2 EXPERIMENTAL DESIGN MODELS Dr. Shalabh Deartment of Mathematcs and Statstcs Indan Insttute of Technology Kanur 2 We consder the models

More information

Statistics for Economics & Business

Statistics for Economics & Business Statstcs for Economcs & Busness Smple Lnear Regresson Learnng Objectves In ths chapter, you learn: How to use regresson analyss to predct the value of a dependent varable based on an ndependent varable

More information

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT

Non-Mixture Cure Model for Interval Censored Data: Simulation Study ABSTRACT Malaysan Journal of Mathematcal Scences 8(S): 37-44 (2014) Specal Issue: Internatonal Conference on Mathematcal Scences and Statstcs 2013 (ICMSS2013) MALAYSIAN JOURNAL OF MATHEMATICAL SCIENCES Journal

More information

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE

CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE CHAPTER 5 NUMERICAL EVALUATION OF DYNAMIC RESPONSE Analytcal soluton s usually not possble when exctaton vares arbtrarly wth tme or f the system s nonlnear. Such problems can be solved by numercal tmesteppng

More information

Negative Binomial Regression

Negative Binomial Regression STATGRAPHICS Rev. 9/16/2013 Negatve Bnomal Regresson Summary... 1 Data Input... 3 Statstcal Model... 3 Analyss Summary... 4 Analyss Optons... 7 Plot of Ftted Model... 8 Observed Versus Predcted... 10 Predctons...

More information

Nonparametric Regression Estimation. of Finite Population Totals. under Two-Stage Sampling

Nonparametric Regression Estimation. of Finite Population Totals. under Two-Stage Sampling Nonparametrc Regresson Estmaton of Fnte Populaton Totals under Two-Stage Samplng J-Yeon Km Iowa State Unversty F. Jay Bredt Colorado State Unversty Jean D. Opsomer Iowa State Unversty ay 21, 2003 Abstract

More information

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur

Dr. Shalabh Department of Mathematics and Statistics Indian Institute of Technology Kanpur Analyss of Varance and Desgn of Experment-I MODULE VII LECTURE - 3 ANALYSIS OF COVARIANCE Dr Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur Any scentfc experment s performed

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 30 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 2 Remedes for multcollnearty Varous technques have

More information

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests

Simulated Power of the Discrete Cramér-von Mises Goodness-of-Fit Tests Smulated of the Cramér-von Mses Goodness-of-Ft Tests Steele, M., Chaselng, J. and 3 Hurst, C. School of Mathematcal and Physcal Scences, James Cook Unversty, Australan School of Envronmental Studes, Grffth

More information

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6

Department of Quantitative Methods & Information Systems. Time Series and Their Components QMIS 320. Chapter 6 Department of Quanttatve Methods & Informaton Systems Tme Seres and Ther Components QMIS 30 Chapter 6 Fall 00 Dr. Mohammad Zanal These sldes were modfed from ther orgnal source for educatonal purpose only.

More information

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity

LINEAR REGRESSION ANALYSIS. MODULE IX Lecture Multicollinearity LINEAR REGRESSION ANALYSIS MODULE IX Lecture - 31 Multcollnearty Dr. Shalabh Department of Mathematcs and Statstcs Indan Insttute of Technology Kanpur 6. Rdge regresson The OLSE s the best lnear unbased

More information

Basically, if you have a dummy dependent variable you will be estimating a probability.

Basically, if you have a dummy dependent variable you will be estimating a probability. ECON 497: Lecture Notes 13 Page 1 of 1 Metropoltan State Unversty ECON 497: Research and Forecastng Lecture Notes 13 Dummy Dependent Varable Technques Studenmund Chapter 13 Bascally, f you have a dummy

More information

A General Class of Selection Procedures and Modified Murthy Estimator

A General Class of Selection Procedures and Modified Murthy Estimator ISS 684-8403 Journal of Statstcs Volume 4, 007,. 3-9 A General Class of Selecton Procedures and Modfed Murthy Estmator Abdul Bast and Muhammad Qasar Shahbaz Abstract A new selecton rocedure for unequal

More information

Goodness of fit and Wilks theorem

Goodness of fit and Wilks theorem DRAFT 0.0 Glen Cowan 3 June, 2013 Goodness of ft and Wlks theorem Suppose we model data y wth a lkelhood L(µ) that depends on a set of N parameters µ = (µ 1,...,µ N ). Defne the statstc t µ ln L(µ) L(ˆµ),

More information

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix

Lectures - Week 4 Matrix norms, Conditioning, Vector Spaces, Linear Independence, Spanning sets and Basis, Null space and Range of a Matrix Lectures - Week 4 Matrx norms, Condtonng, Vector Spaces, Lnear Independence, Spannng sets and Bass, Null space and Range of a Matrx Matrx Norms Now we turn to assocatng a number to each matrx. We could

More information

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009

College of Computer & Information Science Fall 2009 Northeastern University 20 October 2009 College of Computer & Informaton Scence Fall 2009 Northeastern Unversty 20 October 2009 CS7880: Algorthmc Power Tools Scrbe: Jan Wen and Laura Poplawsk Lecture Outlne: Prmal-dual schema Network Desgn:

More information

Lecture 10 Support Vector Machines II

Lecture 10 Support Vector Machines II Lecture 10 Support Vector Machnes II 22 February 2016 Taylor B. Arnold Yale Statstcs STAT 365/665 1/28 Notes: Problem 3 s posted and due ths upcomng Frday There was an early bug n the fake-test data; fxed

More information

Lecture 12: Classification

Lecture 12: Classification Lecture : Classfcaton g Dscrmnant functons g The optmal Bayes classfer g Quadratc classfers g Eucldean and Mahalanobs metrcs g K Nearest Neghbor Classfers Intellgent Sensor Systems Rcardo Guterrez-Osuna

More information

First Year Examination Department of Statistics, University of Florida

First Year Examination Department of Statistics, University of Florida Frst Year Examnaton Department of Statstcs, Unversty of Florda May 7, 010, 8:00 am - 1:00 noon Instructons: 1. You have four hours to answer questons n ths examnaton.. You must show your work to receve

More information

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours

UNIVERSITY OF TORONTO Faculty of Arts and Science. December 2005 Examinations STA437H1F/STA1005HF. Duration - 3 hours UNIVERSITY OF TORONTO Faculty of Arts and Scence December 005 Examnatons STA47HF/STA005HF Duraton - hours AIDS ALLOWED: (to be suppled by the student) Non-programmable calculator One handwrtten 8.5'' x

More information

Linear Regression Analysis: Terminology and Notation

Linear Regression Analysis: Terminology and Notation ECON 35* -- Secton : Basc Concepts of Regresson Analyss (Page ) Lnear Regresson Analyss: Termnology and Notaton Consder the generc verson of the smple (two-varable) lnear regresson model. It s represented

More information

The Ordinary Least Squares (OLS) Estimator

The Ordinary Least Squares (OLS) Estimator The Ordnary Least Squares (OLS) Estmator 1 Regresson Analyss Regresson Analyss: a statstcal technque for nvestgatng and modelng the relatonshp between varables. Applcatons: Engneerng, the physcal and chemcal

More information

Limited Dependent Variables

Limited Dependent Variables Lmted Dependent Varables. What f the left-hand sde varable s not a contnuous thng spread from mnus nfnty to plus nfnty? That s, gven a model = f (, β, ε, where a. s bounded below at zero, such as wages

More information

Explaining the Stein Paradox

Explaining the Stein Paradox Explanng the Sten Paradox Kwong Hu Yung 1999/06/10 Abstract Ths report offers several ratonale for the Sten paradox. Sectons 1 and defnes the multvarate normal mean estmaton problem and ntroduces Sten

More information

III. Econometric Methodology Regression Analysis

III. Econometric Methodology Regression Analysis Page Econ07 Appled Econometrcs Topc : An Overvew of Regresson Analyss (Studenmund, Chapter ) I. The Nature and Scope of Econometrcs. Lot s of defntons of econometrcs. Nobel Prze Commttee Paul Samuelson,

More information

4.3 Poisson Regression

4.3 Poisson Regression of teratvely reweghted least squares regressons (the IRLS algorthm). We do wthout gvng further detals, but nstead focus on the practcal applcaton. > glm(survval~log(weght)+age, famly="bnomal", data=baby)

More information

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010

Parametric fractional imputation for missing data analysis. Jae Kwang Kim Survey Working Group Seminar March 29, 2010 Parametrc fractonal mputaton for mssng data analyss Jae Kwang Km Survey Workng Group Semnar March 29, 2010 1 Outlne Introducton Proposed method Fractonal mputaton Approxmaton Varance estmaton Multple mputaton

More information

Random Partitions of Samples

Random Partitions of Samples Random Parttons of Samples Klaus Th. Hess Insttut für Mathematsche Stochastk Technsche Unverstät Dresden Abstract In the present paper we construct a decomposton of a sample nto a fnte number of subsamples

More information

A nonparametric two-sample wald test of equality of variances

A nonparametric two-sample wald test of equality of variances Unversty of Wollongong Research Onlne Centre for Statstcal & Survey Methodology Workng Paper Seres Faculty of Engneerng and Informaton Scences 0 A nonparametrc two-sample wald test of equalty of varances

More information

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva

Econ Statistical Properties of the OLS estimator. Sanjaya DeSilva Econ 39 - Statstcal Propertes of the OLS estmator Sanjaya DeSlva September, 008 1 Overvew Recall that the true regresson model s Y = β 0 + β 1 X + u (1) Applyng the OLS method to a sample of data, we estmate

More information

Graph Reconstruction by Permutations

Graph Reconstruction by Permutations Graph Reconstructon by Permutatons Perre Ille and Wllam Kocay* Insttut de Mathémathques de Lumny CNRS UMR 6206 163 avenue de Lumny, Case 907 13288 Marselle Cedex 9, France e-mal: lle@ml.unv-mrs.fr Computer

More information

ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE

ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE P a g e ANOMALIES OF THE MAGNITUDE OF THE BIAS OF THE MAXIMUM LIKELIHOOD ESTIMATOR OF THE REGRESSION SLOPE Darmud O Drscoll ¹, Donald E. Ramrez ² ¹ Head of Department of Mathematcs and Computer Studes

More information

Chapter 8 Indicator Variables

Chapter 8 Indicator Variables Chapter 8 Indcator Varables In general, e explanatory varables n any regresson analyss are assumed to be quanttatve n nature. For example, e varables lke temperature, dstance, age etc. are quanttatve n

More information

Introduction to Regression

Introduction to Regression Introducton to Regresson Dr Tom Ilvento Department of Food and Resource Economcs Overvew The last part of the course wll focus on Regresson Analyss Ths s one of the more powerful statstcal technques Provdes

More information

Computing MLE Bias Empirically

Computing MLE Bias Empirically Computng MLE Bas Emprcally Kar Wa Lm Australan atonal Unversty January 3, 27 Abstract Ths note studes the bas arses from the MLE estmate of the rate parameter and the mean parameter of an exponental dstrbuton.

More information

Credit Card Pricing and Impact of Adverse Selection

Credit Card Pricing and Impact of Adverse Selection Credt Card Prcng and Impact of Adverse Selecton Bo Huang and Lyn C. Thomas Unversty of Southampton Contents Background Aucton model of credt card solctaton - Errors n probablty of beng Good - Errors n

More information

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family

Using T.O.M to Estimate Parameter of distributions that have not Single Exponential Family IOSR Journal of Mathematcs IOSR-JM) ISSN: 2278-5728. Volume 3, Issue 3 Sep-Oct. 202), PP 44-48 www.osrjournals.org Usng T.O.M to Estmate Parameter of dstrbutons that have not Sngle Exponental Famly Jubran

More information

Population element: 1 2 N. 1.1 Sampling with Replacement: Hansen-Hurwitz Estimator(HH)

Population element: 1 2 N. 1.1 Sampling with Replacement: Hansen-Hurwitz Estimator(HH) Chapter 1 Samplng wth Unequal Probabltes Notaton: Populaton element: 1 2 N varable of nterest Y : y1 y2 y N Let s be a sample of elements drawn by a gven samplng method. In other words, s s a subset of

More information

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise.

Chapter 2 - The Simple Linear Regression Model S =0. e i is a random error. S β2 β. This is a minimization problem. Solution is a calculus exercise. Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where y + = β + β e for =,..., y and are observable varables e s a random error How can an estmaton rule be constructed for the

More information

e i is a random error

e i is a random error Chapter - The Smple Lnear Regresson Model The lnear regresson equaton s: where + β + β e for,..., and are observable varables e s a random error How can an estmaton rule be constructed for the unknown

More information

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE

THE ROYAL STATISTICAL SOCIETY 2006 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE THE ROYAL STATISTICAL SOCIETY 6 EXAMINATIONS SOLUTIONS HIGHER CERTIFICATE PAPER I STATISTICAL THEORY The Socety provdes these solutons to assst canddates preparng for the eamnatons n future years and for

More information

A Bound for the Relative Bias of the Design Effect

A Bound for the Relative Bias of the Design Effect A Bound for the Relatve Bas of the Desgn Effect Alberto Padlla Banco de Méxco Abstract Desgn effects are typcally used to compute sample szes or standard errors from complex surveys. In ths paper, we show

More information

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics

ECONOMICS 351*-A Mid-Term Exam -- Fall Term 2000 Page 1 of 13 pages. QUEEN'S UNIVERSITY AT KINGSTON Department of Economics ECOOMICS 35*-A Md-Term Exam -- Fall Term 000 Page of 3 pages QUEE'S UIVERSITY AT KIGSTO Department of Economcs ECOOMICS 35* - Secton A Introductory Econometrcs Fall Term 000 MID-TERM EAM ASWERS MG Abbott

More information

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition)

See Book Chapter 11 2 nd Edition (Chapter 10 1 st Edition) Count Data Models See Book Chapter 11 2 nd Edton (Chapter 10 1 st Edton) Count data consst of non-negatve nteger values Examples: number of drver route changes per week, the number of trp departure changes

More information

Econ107 Applied Econometrics Topic 9: Heteroskedasticity (Studenmund, Chapter 10)

Econ107 Applied Econometrics Topic 9: Heteroskedasticity (Studenmund, Chapter 10) I. Defnton and Problems Econ7 Appled Econometrcs Topc 9: Heteroskedastcty (Studenmund, Chapter ) We now relax another classcal assumpton. Ths s a problem that arses often wth cross sectons of ndvduals,

More information

Improvement in Estimating the Population Mean Using Exponential Estimator in Simple Random Sampling

Improvement in Estimating the Population Mean Using Exponential Estimator in Simple Random Sampling Bulletn of Statstcs & Economcs Autumn 009; Volume 3; Number A09; Bull. Stat. Econ. ISSN 0973-70; Copyrght 009 by BSE CESER Improvement n Estmatng the Populaton Mean Usng Eponental Estmator n Smple Random

More information

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper

Games of Threats. Elon Kohlberg Abraham Neyman. Working Paper Games of Threats Elon Kohlberg Abraham Neyman Workng Paper 18-023 Games of Threats Elon Kohlberg Harvard Busness School Abraham Neyman The Hebrew Unversty of Jerusalem Workng Paper 18-023 Copyrght 2017

More information

Small Area Estimation for Business Surveys

Small Area Estimation for Business Surveys ASA Secton on Survey Research Methods Small Area Estmaton for Busness Surveys Hukum Chandra Southampton Statstcal Scences Research Insttute, Unversty of Southampton Hghfeld, Southampton-SO17 1BJ, U.K.

More information

Sample design and estimation for household surveys

Sample design and estimation for household surveys Unversty of Wollongong Thess Collectons Unversty of Wollongong Thess Collecton Unversty of Wollongong Year 00 Sample desgn and estmaton for household surveys Robert Graham Clark Unversty of Wollongong

More information

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University

PHYS 450 Spring semester Lecture 02: Dealing with Experimental Uncertainties. Ron Reifenberger Birck Nanotechnology Center Purdue University PHYS 45 Sprng semester 7 Lecture : Dealng wth Expermental Uncertantes Ron Refenberger Brck anotechnology Center Purdue Unversty Lecture Introductory Comments Expermental errors (really expermental uncertantes)

More information

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS Avalable onlne at http://sck.org J. Math. Comput. Sc. 3 (3), No., 6-3 ISSN: 97-537 COMPARISON OF SOME RELIABILITY CHARACTERISTICS BETWEEN REDUNDANT SYSTEMS REQUIRING SUPPORTING UNITS FOR THEIR OPERATIONS

More information

On Outlier Robust Small Area Mean Estimate Based on Prediction of Empirical Distribution Function

On Outlier Robust Small Area Mean Estimate Based on Prediction of Empirical Distribution Function On Outler Robust Small Area Mean Estmate Based on Predcton of Emprcal Dstrbuton Functon Payam Mokhtaran Natonal Insttute of Appled Statstcs Research Australa Unversty of Wollongong Small Area Estmaton

More information

Lecture 3: Probability Distributions

Lecture 3: Probability Distributions Lecture 3: Probablty Dstrbutons Random Varables Let us begn by defnng a sample space as a set of outcomes from an experment. We denote ths by S. A random varable s a functon whch maps outcomes nto the

More information

Lecture 6: Introduction to Linear Regression

Lecture 6: Introduction to Linear Regression Lecture 6: Introducton to Lnear Regresson An Manchakul amancha@jhsph.edu 24 Aprl 27 Lnear regresson: man dea Lnear regresson can be used to study an outcome as a lnear functon of a predctor Example: 6

More information

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications

Durban Watson for Testing the Lack-of-Fit of Polynomial Regression Models without Replications Durban Watson for Testng the Lack-of-Ft of Polynomal Regresson Models wthout Replcatons Ruba A. Alyaf, Maha A. Omar, Abdullah A. Al-Shha ralyaf@ksu.edu.sa, maomar@ksu.edu.sa, aalshha@ksu.edu.sa Department

More information

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y)

Here is the rationale: If X and y have a strong positive relationship to one another, then ( x x) will tend to be positive when ( y y) Secton 1.5 Correlaton In the prevous sectons, we looked at regresson and the value r was a measurement of how much of the varaton n y can be attrbuted to the lnear relatonshp between y and x. In ths secton,

More information

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING

Department of Computer Science Artificial Intelligence Research Laboratory. Iowa State University MACHINE LEARNING MACHINE LEANING Vasant Honavar Bonformatcs and Computatonal Bology rogram Center for Computatonal Intellgence, Learnng, & Dscovery Iowa State Unversty honavar@cs.astate.edu www.cs.astate.edu/~honavar/

More information

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline

Outline. Bayesian Networks: Maximum Likelihood Estimation and Tree Structure Learning. Our Model and Data. Outline Outlne Bayesan Networks: Maxmum Lkelhood Estmaton and Tree Structure Learnng Huzhen Yu janey.yu@cs.helsnk.f Dept. Computer Scence, Unv. of Helsnk Probablstc Models, Sprng, 200 Notces: I corrected a number

More information

Andreas C. Drichoutis Agriculural University of Athens. Abstract

Andreas C. Drichoutis Agriculural University of Athens. Abstract Heteroskedastcty, the sngle crossng property and ordered response models Andreas C. Drchouts Agrculural Unversty of Athens Panagots Lazards Agrculural Unversty of Athens Rodolfo M. Nayga, Jr. Texas AMUnversty

More information