Inference for Two Stage Cluster Sampling: Equal SSUs per PSU
Projection of SSU Random Variables on Each SSU Selection
By Ed Stanek
c00ed57doc Created 0/3/00 3:58 P

Introduction

We review estimating equations for PSU means in a two stage cluster sample where there are equal numbers of SSUs per PSU. In this development, we project the basic SSU random variables (with $M$ random variables per selection) onto a subspace that has one random variable per selection. This development closely parallels the development for a two stage sample given in c00ed67doc.

A Superpopulation Framework for Inference of Linear Functions: Two Stage Cluster Sampling

1 Definition of the Population and the Superpopulation

1.1 The Population

The finite population is a labeled population of $s = 1, \ldots, N$ primary sampling units (PSUs), where each PSU has the same number of secondary sampling units (SSUs), $t = 1, \ldots, M$. The non-stochastic values of the units in the population are represented by $\mathbf{y} = (\mathbf{y}_1' \; \mathbf{y}_2' \; \cdots \; \mathbf{y}_N')'$, where $\mathbf{y}_s = (y_{s1} \; y_{s2} \; \cdots \; y_{sM})'$ for $s = 1, \ldots, N$. These are population parameters.

1.2 The Superpopulation

We define the superpopulation to be a vector of random variables that arise from the two stage random permutations that are the basis of the two stage sampling. Normally, such
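random variables are represented by a linear combination of the underlying indicator random variables and unit values, such that
$$Y_{ij} = \sum_{s=1}^{N} \sum_{t=1}^{M} U_{is}\, U_{jt}^{(s)}\, y_{st}.$$
In this expression, $U_{is}$ is an indicator random variable whose realized value is one if the $i$-th selected PSU is PSU $s$, and zero otherwise. Similarly, $U_{jt}^{(s)}$ is an indicator random variable whose realized value is one if the $j$-th selected SSU in PSU $s$ is SSU $t$, and zero otherwise.

We expand the expression for the random variables so as to represent each product of random variables in the linear combination uniquely. The basic random variables are $U_{is} U_{jt}^{(s)} y_{st}$. We arrange these random variables in a vector $\mathbf{Z}$ of dimension $N^2M^2 \times 1$ such that $\mathbf{Z} = (\mathbf{Z}_1' \; \mathbf{Z}_2' \; \cdots \; \mathbf{Z}_N')'$, where the individual vectors are $\mathbf{Z}_s = (\mathbf{Z}_{s1}' \; \mathbf{Z}_{s2}' \; \cdots \; \mathbf{Z}_{sM}')'$, and where the $\mathbf{Z}_{st}$ are of dimension $NM \times 1$ and are given by
$$\mathbf{Z}_{st} = y_{st}\,\big(U_{1s}U_{1t}^{(s)} \;\; U_{1s}U_{2t}^{(s)} \;\cdots\; U_{1s}U_{Mt}^{(s)} \;\cdots\; U_{Ns}U_{1t}^{(s)} \;\; U_{Ns}U_{2t}^{(s)} \;\cdots\; U_{Ns}U_{Mt}^{(s)}\big)'.$$
The superpopulation spans a larger dimensional space (i.e. $\mathbf{Z}$ is a point in $N^2M^2$-dimensional space) than the population.

1.3 Projection of the Superpopulation SSUs onto a Single SSU Random Variable per Selection

We define a projection of SSU random variables onto a single SSU random variable per selection. The projection matrix is given by $\mathbf{P} = \mathbf{I}_N \otimes \mathbf{1}_M' \otimes \mathbf{I}_N \otimes \mathbf{I}_M$, where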
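each $\mathbf{I}_a$ denotes the $a \times a$ identity matrix (ones on the diagonal, zeros elsewhere) and $\mathbf{1}_M$ is an $M \times 1$ vector of ones.

After the projection, the basic random variables are of the form $\sum_{t=1}^{M} U_{is} U_{jt}^{(s)} y_{st}$. Hence $\mathbf{Z}^* = \mathbf{P}\mathbf{Z} = (\mathbf{Z}_1^{*\prime} \; \cdots \; \mathbf{Z}_N^{*\prime})'$ with $\mathbf{Z}_s^* = \sum_{t=1}^{M} \mathbf{Z}_{st}$ of dimension $NM \times 1$, where elements of $\mathbf{Z}_s^*$ are given by $\sum_t U_{is} U_{jt}^{(s)} y_{st}$. Thus,
$$\mathbf{Z}_s^* = \Big(\sum_t U_{1s}U_{1t}^{(s)} y_{st} \;\; \sum_t U_{1s}U_{2t}^{(s)} y_{st} \;\cdots\; \sum_t U_{1s}U_{Mt}^{(s)} y_{st} \;\cdots\; \sum_t U_{Ns}U_{1t}^{(s)} y_{st} \;\; \sum_t U_{Ns}U_{2t}^{(s)} y_{st} \;\cdots\; \sum_t U_{Ns}U_{Mt}^{(s)} y_{st}\Big)'.$$

1.4 The Relationship between the Projected Superpopulation and the Population

There is a relationship between the superpopulation and some of the parameters in the population. The projected random superpopulation vector cannot be projected onto the full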
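set of population parameters. It is possible to project onto the vector of PSU means, $\boldsymbol{\mu}$, however. For example, note that $\sum_{i=1}^{N} \sum_t U_{is} U_{jt}^{(s)} y_{st} = \sum_t U_{jt}^{(s)} y_{st}$ since $\sum_{i=1}^{N} U_{is} = 1$. Also, $\frac{1}{M} \sum_{j=1}^{M} \sum_t U_{jt}^{(s)} y_{st} = \mu_s$. Thus, if $\mathbf{C}^* = \mathbf{I}_N \otimes \mathbf{1}_N' \otimes \frac{1}{M}\mathbf{1}_M'$, then $\mathbf{C}^*\mathbf{Z}^* = \boldsymbol{\mu}$.

2 Population Parameters and Model for the Projected Superpopulation

The objective is estimation of parameters that can be expressed as linear functions of the values in the finite population. The functions correspond to individual PSU means. We first define these parameters in the population. Next, we define a model for the projected superpopulation and relate the projected superpopulation parameters to the population parameters. In addition, we specify any additional unbiased constraints and assumptions. Finally, we develop expressions for the expected value and variance of the projected superpopulation random variables.

2.1 Linear Functions of Population Parameters of Interest

The parameters of interest correspond to the PSU means given by $\mu_s = \frac{1}{M}\sum_{t=1}^{M} y_{st}$ for $s = 1, \ldots, N$. These parameters are linear functions of the population parameters. We represent these parameters in a vector, $\boldsymbol{\mu} = \mathbf{C}\mathbf{y}$, where $\boldsymbol{\mu} = (\mu_1 \; \mu_2 \; \cdots \; \mu_N)'$ and $\mathbf{C} = \mathbf{I}_N \otimes \frac{1}{M}\mathbf{1}_M'$. We can express the population parameters as the sum of the parameters of interest (the PSU means) plus a residual, such that $\mathbf{y} = \mathbf{X}\boldsymbol{\mu} + \mathbf{e}$, where $\mathbf{X} = \mathbf{I}_N \otimes \mathbf{1}_M$ and $\mathbf{e} = \mathbf{y} - \mathbf{X}\boldsymbol{\mu}$.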
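The decomposition of the population values into PSU means plus residuals can be illustrated with a short sketch. The population values below are hypothetical, and the function names are introduced only for this illustration; the code simply applies $\mu_s = \frac{1}{M}\sum_t y_{st}$ and $\mathbf{e} = \mathbf{y} - \mathbf{X}\boldsymbol{\mu}$ row by row rather than forming the Kronecker products explicitly.

```python
from fractions import Fraction


def psu_means(y):
    """Return mu_s = (1/M) * sum_t y_st for each PSU (each row of y)."""
    return [Fraction(sum(row), len(row)) for row in y]


def residuals(y, mu):
    """Return e = y - X mu: each SSU value minus the mean of its PSU."""
    return [[Fraction(v) - m for v in row] for row, m in zip(y, mu)]


# Hypothetical population with N = 3 PSUs and M = 2 SSUs per PSU.
y = [[1, 3], [4, 8], [10, 10]]
mu = psu_means(y)
e = residuals(y, mu)

print(mu)  # PSU means
print(e)   # residuals; they sum to zero within each PSU
```

Because $\mathbf{C}\mathbf{X} = \mathbf{I}_N$, the residuals satisfy $\mathbf{C}\mathbf{e} = \mathbf{0}$, which is the within-PSU sum-to-zero property visible in the output.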
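2.2 A Model for the Projected Superpopulation

We described a model for the superpopulation in c00ed49doc as $\mathbf{Z} = \mathbf{X}_Z\boldsymbol{\beta} + \mathbf{E}$, where $\mathbf{X}_Z$ is a matrix of constants and $\boldsymbol{\beta}$ is a parameter vector. Projecting this model onto the subspace results in the model
$$\mathbf{Z}^* = \mathbf{P}\mathbf{Z} = \mathbf{P}\mathbf{X}_Z\boldsymbol{\beta} + \mathbf{P}\mathbf{E} = \mathbf{X}^*\boldsymbol{\beta} + \mathbf{E}^*.$$
The projected superpopulation model contains a deterministic and a random component. Without additional assumptions, there is no explicit connection between the parameters in the projected superpopulation, $\boldsymbol{\beta}$, and the population parameters $\boldsymbol{\mu}$.

2.3 The Relationship between Projected Superpopulation Parameters and Population Parameters

We relate the parameters in the model for $\mathbf{Z}^*$ to parameters in the population. First, note that the projected superpopulation $\mathbf{Z}^*$ is of dimension $N^2M \times 1$, whereas the population $\mathbf{y}$ is of dimension $NM \times 1$. As indicated in section 1.4, even though the space spanned by $\mathbf{Z}^*$ is of higher dimension than that of $\mathbf{y}$, the vector space spanned by $\mathbf{y}$ is not a subspace of $\mathbf{Z}^*$. The two vector spaces do intersect, however, and this intersection enables us to relate $\boldsymbol{\mu}$ to $\boldsymbol{\beta}$. Using the projection $\mathbf{C}^*$, $\mathbf{C}^*\mathbf{Z}^* = \boldsymbol{\mu}$. Also, $\mathbf{C}^*\mathbf{Z}^* = \mathbf{C}^*\mathbf{X}^*\boldsymbol{\beta} + \mathbf{C}^*\mathbf{E}^*$. The nonstochastic portion of the projected superpopulation model, when projected onto the population parameters of interest (given by the PSU means), is given by $\mathbf{C}^*\mathbf{X}^*\boldsymbol{\beta}$. In the population, $\mathbf{C}\mathbf{y} = \boldsymbol{\mu}$. Also, the population model is given by $\mathbf{y} = \mathbf{X}\boldsymbol{\mu} + \mathbf{e}$. We can project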
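these terms on the space spanned by $\boldsymbol{\mu}$ such that $\mathbf{C}\mathbf{y} = \mathbf{C}\mathbf{X}\boldsymbol{\mu} + \mathbf{C}\mathbf{e}$. We define the superpopulation model parameters $\boldsymbol{\beta}$ such that $\mathbf{C}^*\mathbf{X}^*\boldsymbol{\beta} = \mathbf{C}\mathbf{X}\boldsymbol{\mu}$. Without loss of generality, we further require that $\boldsymbol{\beta} = \boldsymbol{\mu}$. This implies that $\mathbf{C}^*\mathbf{X}^* = \mathbf{C}\mathbf{X} = \mathbf{I}_N$. Recall that $\mathbf{C}^* = \mathbf{I}_N \otimes \mathbf{1}_N' \otimes \frac{1}{M}\mathbf{1}_M'$. Even with this assumption, the design matrix $\mathbf{X}^*$ is not uniquely defined. One design matrix that will satisfy this constraint is given by $\mathbf{X}^* = \frac{1}{N}\,\mathbf{I}_N \otimes \mathbf{1}_N \otimes \mathbf{1}_M$.

In the two stage superpopulation model that was not projected onto the SSU random variables (see c00ed49doc, p4), the design matrix $\mathbf{X}_Z$ was also not unique, and was set equal to a particular choice given there. The choice of $\mathbf{X}^*$ in the superpopulation model projected on the SSU random variables is compatible with that choice of $\mathbf{X}_Z$, since these two design matrices satisfy the requirement that $\mathbf{P}\mathbf{X}_Z = \mathbf{X}^*$ (where $\mathbf{P} = \mathbf{I}_N \otimes \mathbf{1}_M' \otimes \mathbf{I}_{NM}$). We use this design matrix in subsequent developments. Since $\boldsymbol{\beta} = \boldsymbol{\mu}$ and $\mathbf{C}^*\mathbf{Z}^* = \boldsymbol{\mu}$, we can express the superpopulation parameters as a linear function of the projected superpopulation vector. Thus, $\boldsymbol{\beta} = \mathbf{C}^*\mathbf{Z}^*$.

3 Expected Value and Variance of the Projected Superpopulation

We construct expressions for the expected value and variance of the superpopulation using the subscript $\xi_1$ to indicate expectation with respect to selection of PSUs and the subscript $\xi_2$ to indicate expectation with respect to selection of SSUs. Expressions for the expected value and variance are developed for a superpopulation with unequal numbers of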
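SSUs in c00ed9doc. These expressions are simplified for the setting where there are equal numbers of SSUs in each PSU in c00ed6doc. The expression for the variance is further simplified in c00ed38doc. These results are summarized in c00ed49doc. We use the results summarized in c00ed49doc (p5-6) to obtain expressions for $E_{\xi_1\xi_2}(\mathbf{Z}^*) = \mathbf{P}\,E_{\xi_1\xi_2}(\mathbf{Z})$ and $\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z}^*) = \mathbf{P}\,\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z})\,\mathbf{P}'$.

3.1 Expected Value of the Projected Superpopulation Vector

From c00ed49doc (p5), $E_{\xi_1\xi_2}(\mathbf{Z}) = \frac{1}{NM}\,(\mathbf{y} \otimes \mathbf{1}_{NM})$. Since $\mathbf{P} = \mathbf{I}_N \otimes \mathbf{1}_M' \otimes \mathbf{I}_{NM}$ and $(\mathbf{I}_N \otimes \mathbf{1}_M')\mathbf{y} = M\boldsymbol{\mu}$,
$$E_{\xi_1\xi_2}(\mathbf{Z}^*) = \mathbf{P}\,E_{\xi_1\xi_2}(\mathbf{Z}) = \frac{1}{N}\,(\boldsymbol{\mu} \otimes \mathbf{1}_{NM}) = \mathbf{X}^*\boldsymbol{\mu}.$$

3.2 Variance of the Projected Superpopulation Vector

The variance $\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z})$ is given in c00ed49doc (p6) in terms of $\mathbf{D}_y \otimes \mathbf{I}_{NM}$, $\mathbf{v}_{\xi_1}$ and $\mathbf{v}_{\xi_2}$, where $\mathbf{D}_y = \bigoplus_{s=1}^{N} \mathbf{D}_{y_s}$, $\mathbf{D}_{y_s} = \mathrm{diag}(y_{s1}, y_{s2}, \ldots, y_{sM})$, $\mathbf{v}_{\xi_1} = \mathbf{I}_N - \frac{1}{N}\mathbf{J}_N$ and $\mathbf{v}_{\xi_2} = \mathbf{I}_M - \frac{1}{M}\mathbf{J}_M$. Since $\mathbf{Z}^* = \mathbf{P}\mathbf{Z}$ with $\mathbf{P} = \mathbf{I}_N \otimes \mathbf{1}_M' \otimes \mathbf{I}_{NM}$, we have $\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z}^*) = \mathbf{P}\,\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z})\,\mathbf{P}'$, in which the factors $\mathbf{D}_y \otimes \mathbf{I}_{NM}$ appear pre-multiplied by $\mathbf{P}$ and post-multiplied by $\mathbf{P}'$.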
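For a very small population, the expected value and variance of the projected superpopulation can be checked by enumerating every equally likely two stage permutation. The sketch below does this with exact rational arithmetic. The population values are hypothetical, the helper names are introduced only for this illustration, and the closed form it compares against assumes $\sigma_s^2$ is defined with divisor $M - 1$ and that $\mathbf{Z}^*$ is ordered by PSU label $s$, then PSU position $i$, then SSU position $j$.

```python
from fractions import Fraction
from itertools import permutations, product


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * v for x in ra for v in rb] for ra in a for rb in b]


def projected_vectors(y):
    """Yield Z* for every PSU permutation and every set of SSU permutations."""
    N, M = len(y), len(y[0])
    for psu_perm in permutations(range(N)):        # psu_perm[i] = PSU in position i
        for ssu_perms in product(permutations(range(M)), repeat=N):
            yield [(1 if psu_perm[i] == s else 0) * y[s][ssu_perms[s][j]]
                   for s in range(N) for i in range(N) for j in range(M)]


def exact_moments(vectors):
    """Exact mean vector and covariance matrix over equally likely vectors."""
    vectors = list(vectors)
    n, d = len(vectors), len(vectors[0])
    mean = [Fraction(sum(col), n) for col in zip(*vectors)]
    cov = [[sum(Fraction(v[a]) * v[b] for v in vectors) / n - mean[a] * mean[b]
            for b in range(d)] for a in range(d)]
    return mean, cov


def closed_form_cov(y):
    """(1/N) D_sigma2 (x) I_N (x) v2 + (1/(N-1)) (D_mu^2 - mu mu'/N) (x) v1 (x) J_M."""
    N, M = len(y), len(y[0])
    mu = [Fraction(sum(r), M) for r in y]
    sig2 = [sum((v - m) ** 2 for v in r) / (M - 1) for r, m in zip(y, mu)]
    eye = lambda n: [[Fraction(int(a == b)) for b in range(n)] for a in range(n)]
    center = lambda n: [[Fraction(int(a == b)) - Fraction(1, n) for b in range(n)]
                        for a in range(n)]
    ones = lambda n: [[Fraction(1)] * n for _ in range(n)]
    d_sig = [[sig2[a] if a == b else Fraction(0) for b in range(N)] for a in range(N)]
    b_mu = [[((mu[a] ** 2 if a == b else 0) - mu[a] * mu[b] / N) / (N - 1)
             for b in range(N)] for a in range(N)]
    t1 = kron(kron(d_sig, eye(N)), center(M))
    t2 = kron(kron(b_mu, center(N)), ones(M))
    d = len(t1)
    return [[t1[a][b] / N + t2[a][b] for b in range(d)] for a in range(d)]


y = [[1, 3], [4, 8]]                      # hypothetical values, N = 2, M = 2
mean, cov = exact_moments(projected_vectors(y))
print(mean)                               # each element equals mu_s / N
print(cov == closed_form_cov(y))
```

Enumeration is only practical for tiny $N$ and $M$, since the number of permutations grows as $N!\,(M!)^N$; the point of the sketch is to confirm the algebra, not to compute variances in practice.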
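We simplify this expression by evaluating
$$\mathbf{P}(\mathbf{D}_y \otimes \mathbf{I}_{NM}) = (\mathbf{I}_N \otimes \mathbf{1}_M' \otimes \mathbf{I}_{NM})(\mathbf{D}_y \otimes \mathbf{I}_{NM}) = \Big(\bigoplus_{s=1}^{N} \mathbf{y}_s'\Big) \otimes \mathbf{I}_{NM}.$$
Then $\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z}^*)$ can be written with $\big(\bigoplus_{s} \mathbf{y}_s'\big) \otimes \mathbf{I}_{NM}$ in place of each $\mathbf{P}(\mathbf{D}_y \otimes \mathbf{I}_{NM})$, so that the values $y_{st}$ enter only through quadratic forms in the vectors $\mathbf{y}_s$.

The first term in these expressions can also be simplified. First, note that the within-PSU part involves $\mathbf{y}_s'\big(\mathbf{I}_M - \frac{1}{M}\mathbf{J}_M\big)\mathbf{y}_s$. Now
$$\mathbf{y}_s'\Big(\mathbf{I}_M - \frac{1}{M}\mathbf{J}_M\Big)\mathbf{y}_s = \sum_{t=1}^{M} (y_{st} - \mu_s)^2.$$
Let us define $\sigma_s^2 = \frac{1}{M-1}\sum_{t=1}^{M}(y_{st} - \mu_s)^2$. Then $\mathbf{y}_s'\big(\mathbf{I}_M - \frac{1}{M}\mathbf{J}_M\big)\mathbf{y}_s = (M-1)\,\sigma_s^2$. Next, note that the remaining terms involve $\mathbf{y}_s'\mathbf{J}_M\mathbf{y}_{s^*}$. Also, $\mathbf{y}_s'\mathbf{J}_M\mathbf{y}_{s^*} = (\mathbf{1}_M'\mathbf{y}_s)(\mathbf{1}_M'\mathbf{y}_{s^*})$.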
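Now $\mathbf{1}_M'\mathbf{y}_s = M\mu_s$, and hence $\mathbf{y}_s'\mathbf{J}_M\mathbf{y}_{s^*} = M^2\mu_s\mu_{s^*}$. Then the terms with $s = s^*$ form $M^2\mathbf{D}_\mu^2$, where $\mathbf{D}_\mu = \mathrm{diag}(\mu_1, \mu_2, \ldots, \mu_N)$, and the terms over all pairs $(s, s^*)$ form $M^2\boldsymbol{\mu}\boldsymbol{\mu}'$. We use these two simplifications to express $\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z}^*)$. As a result,
$$\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z}^*) = \Big(\bigoplus_{s=1}^{N}\sigma_s^2\Big) \otimes \frac{1}{N}\mathbf{I}_N \otimes \mathbf{v}_{\xi_2} \;+\; \frac{1}{N-1}\Big(\mathbf{D}_\mu^2 - \frac{1}{N}\boldsymbol{\mu}\boldsymbol{\mu}'\Big) \otimes \big(\mathbf{v}_{\xi_1} \otimes \mathbf{J}_M\big),$$
where $\sigma_s^2 = \frac{1}{M-1}\sum_{t}(y_{st} - \mu_s)^2$, $\mathbf{D}_\mu = \mathrm{diag}(\mu_1, \mu_2, \ldots, \mu_N)$, $\boldsymbol{\mu} = (\mu_1 \; \mu_2 \; \cdots \; \mu_N)'$, $\mathbf{v}_{\xi_1} = \mathbf{I}_N - \frac{1}{N}\mathbf{J}_N$ and $\mathbf{v}_{\xi_2} = \mathbf{I}_M - \frac{1}{M}\mathbf{J}_M$.

4 Sampling, Re-arranging, and Partitioning

We specify a re-arrangement of the projected superpopulation vector so that it can be partitioned into the sampled and remaining portions. We assume that simple random without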
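replacement sampling is used to select $i = 1, \ldots, I$ PSUs, and from each selected PSU, simple random without replacement sampling is used to select $j = 1, \ldots, m$ SSUs. The sampled portion of the superpopulation we consider to be realized and non-stochastic. We focus attention on estimation of the remainder of the population. We specify a criterion for estimation, and then derive estimators (or estimating equations) that optimize this criterion.

4.1 Re-arranging and Partitioning the Projected Superpopulation

The two stage sampling corresponds to selection of a simple random without replacement sample of $I$ PSUs, and from each selected PSU selecting a simple random without replacement sample of $m$ SSUs. We re-arrange terms in the projected superpopulation model into a sampled and remainder vector by pre-multiplying by an $(N^2M \times N^2M)$ permutation matrix $\mathbf{K}$. The matrix is defined as follows. Let
$$\mathbf{K}_1 = \mathbf{I}_N \otimes \big(\mathbf{I}_I \;\; \mathbf{0}_{I \times (N-I)}\big) \otimes \big(\mathbf{I}_m \;\; \mathbf{0}_{m \times (M-m)}\big)$$
and
$$\mathbf{K}_2 = \begin{pmatrix} \mathbf{I}_N \otimes \big(\mathbf{I}_I \;\; \mathbf{0}_{I \times (N-I)}\big) \otimes \big(\mathbf{0}_{(M-m) \times m} \;\; \mathbf{I}_{M-m}\big) \\ \mathbf{I}_N \otimes \big(\mathbf{0}_{(N-I) \times I} \;\; \mathbf{I}_{N-I}\big) \otimes \big(\mathbf{I}_m \;\; \mathbf{0}_{m \times (M-m)}\big) \\ \mathbf{I}_N \otimes \big(\mathbf{0}_{(N-I) \times I} \;\; \mathbf{I}_{N-I}\big) \otimes \big(\mathbf{0}_{(M-m) \times m} \;\; \mathbf{I}_{M-m}\big) \end{pmatrix}.$$
We define $\mathbf{K} = \begin{pmatrix}\mathbf{K}_1 \\ \mathbf{K}_2\end{pmatrix}$, resulting in the model
$$\mathbf{K}\mathbf{Z}^* = \begin{pmatrix}\mathbf{Z}_I \\ \mathbf{Z}_r\end{pmatrix} = \begin{pmatrix}\mathbf{X}_I \\ \mathbf{X}_r\end{pmatrix}\boldsymbol{\beta} + \begin{pmatrix}\mathbf{E}_I \\ \mathbf{E}_r\end{pmatrix}.$$
The vector $\mathbf{Z}_I$ is of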
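dimension $(NIm \times 1)$, with terms given by $\mathbf{Z}_I = (\mathbf{Z}_{I1}' \; \mathbf{Z}_{I2}' \; \cdots \; \mathbf{Z}_{IN}')'$, where
$$\mathbf{Z}_{Is} = \Big(\sum_t U_{1s}U_{1t}^{(s)} y_{st} \;\; \sum_t U_{1s}U_{2t}^{(s)} y_{st} \;\cdots\; \sum_t U_{1s}U_{mt}^{(s)} y_{st} \;\cdots\; \sum_t U_{Is}U_{1t}^{(s)} y_{st} \;\; \sum_t U_{Is}U_{2t}^{(s)} y_{st} \;\cdots\; \sum_t U_{Is}U_{mt}^{(s)} y_{st}\Big)'$$
is of dimension $Im \times 1$. The vector $\mathbf{Z}_r$ is of dimension $(N(NM - Im) \times 1)$. Since $\mathbf{X}^* = \frac{1}{N}\,\mathbf{I}_N \otimes \mathbf{1}_{NM}$,
$$\mathbf{K}\mathbf{X}^* = \begin{pmatrix}\mathbf{K}_1\mathbf{X}^* \\ \mathbf{K}_2\mathbf{X}^*\end{pmatrix} = \begin{pmatrix}\mathbf{X}_I \\ \mathbf{X}_r\end{pmatrix}, \qquad \mathbf{X}_I = \frac{1}{N}\,\mathbf{I}_N \otimes \mathbf{1}_{Im}, \qquad \mathbf{X}_r = \frac{1}{N}\begin{pmatrix}\mathbf{I}_N \otimes \mathbf{1}_{I(M-m)} \\ \mathbf{I}_N \otimes \mathbf{1}_{(N-I)m} \\ \mathbf{I}_N \otimes \mathbf{1}_{(N-I)(M-m)}\end{pmatrix}.$$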
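The re-arrangement into sampled and remaining portions can be sketched by building the selection blocks as Kronecker products and stacking them. The sizes below are hypothetical, the helper names (`select`, `partition_matrices`) are introduced only for this illustration, and the ordering of the projected vector is assumed to be PSU label $s$, then PSU position $i$, then SSU position $j$.

```python
from fractions import Fraction


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[x * v for x in ra for v in rb] for ra in a for rb in b]


def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(x * v for x, v in zip(row, col)) for col in zip(*b)] for row in a]


def eye(n):
    return [[int(r == c) for c in range(n)] for r in range(n)]


def select(first, total, head=True):
    """Rows of I_total that keep the first `first` positions (head) or the rest."""
    rows = range(first) if head else range(first, total)
    return [[int(r == c) for c in range(total)] for r in rows]


def partition_matrices(N, M, I, m):
    """Return (K1, K2): K1 picks sampled positions, K2 the remaining ones."""
    k1 = kron(kron(eye(N), select(I, N)), select(m, M))
    k2 = (kron(kron(eye(N), select(I, N)), select(m, M, head=False))
          + kron(kron(eye(N), select(I, N, head=False)), select(m, M))
          + kron(kron(eye(N), select(I, N, head=False)), select(m, M, head=False)))
    return k1, k2


N, M, I, m = 2, 3, 1, 2                       # hypothetical sizes
k1, k2 = partition_matrices(N, M, I, m)
K = k1 + k2                                   # stack K1 on top of K2

# Label each element of Z* by (s, i, j) and see which labels K1 selects.
labels = [(s, i, j) for s in range(N) for i in range(N) for j in range(M)]
sampled = [labels[row.index(1)] for row in k1]
print(sampled)

# Partition the design matrix X* = (1/N) I_N (x) 1_{NM}.
x_star = kron(eye(N), [[Fraction(1, N)]] * (N * M))
x_sampled = matmul(k1, x_star)
print(x_sampled == kron(eye(N), [[Fraction(1, N)]] * (I * m)))
```

Each row and each column of the stacked matrix contains a single one, which is what makes it a permutation matrix and gives $\mathbf{K}'\mathbf{K} = \mathbf{I}$.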
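Note that since $\mathbf{K}$ is a permutation matrix, $\mathbf{K}'\mathbf{K} = \mathbf{I}$. We partition the variance of the projected superpopulation in a similar manner. After re-arranging random variables into the sample and remaining portions, $\mathrm{var}_{\xi_1\xi_2}(\mathbf{K}\mathbf{Z}^*) = \mathbf{K}\,\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z}^*)\,\mathbf{K}'$, which we represent as
$$\mathrm{var}_{\xi_1\xi_2}\begin{pmatrix}\mathbf{Z}_I \\ \mathbf{Z}_r\end{pmatrix} = \begin{pmatrix}\mathbf{V}_I & \mathbf{V}_{I,r} \\ \mathbf{V}_{r,I} & \mathbf{V}_r\end{pmatrix}.$$
The expression for $\mathbf{V}_I$ is given by $\mathbf{V}_I = \mathbf{K}_1\,\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z}^*)\,\mathbf{K}_1'$. Since $\mathbf{K}_1 = \mathbf{I}_N \otimes \big(\mathbf{I}_I \;\; \mathbf{0}_{I \times (N-I)}\big) \otimes \big(\mathbf{I}_m \;\; \mathbf{0}_{m \times (M-m)}\big)$ and
$$\mathrm{var}_{\xi_1\xi_2}(\mathbf{Z}^*) = \Big(\bigoplus_{s=1}^{N}\sigma_s^2\Big) \otimes \frac{1}{N}\mathbf{I}_N \otimes \mathbf{v}_{\xi_2} + \frac{1}{N-1}\Big(\mathbf{D}_\mu^2 - \frac{1}{N}\boldsymbol{\mu}\boldsymbol{\mu}'\Big) \otimes \big(\mathbf{v}_{\xi_1} \otimes \mathbf{J}_M\big),$$
the matrix $\mathbf{V}_I$ is obtained by pre-multiplying this expression by $\mathbf{K}_1$ and post-multiplying by $\mathbf{K}_1'$. This simplifies further to
$$\mathbf{V}_I = \Big(\bigoplus_{s=1}^{N}\sigma_s^2\Big) \otimes \frac{1}{N}\mathbf{I}_I \otimes \mathbf{v}_{\xi_2}^* + \frac{1}{N-1}\Big(\mathbf{D}_\mu^2 - \frac{1}{N}\boldsymbol{\mu}\boldsymbol{\mu}'\Big) \otimes \big(\mathbf{v}_{\xi_1}^* \otimes \mathbf{J}_m\big),$$
where $\mathbf{v}_{\xi_2}^* = \mathbf{I}_m - \frac{1}{M}\mathbf{J}_m$ and $\mathbf{v}_{\xi_1}^* = \mathbf{I}_I - \frac{1}{N}\mathbf{J}_I$.

4.2 Partitioning the Parameters into Functions of the Sample and Remaining Projected Superpopulation Vectors

We express the parameters $\boldsymbol{\beta}$ as a linear combination of the sampled and remaining superpopulation vectors. To do so, recall that $\boldsymbol{\beta} = \mathbf{C}^*\mathbf{Z}^*$. Then, introducing the permutation matrix,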
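$$\boldsymbol{\beta} = \mathbf{C}^*\mathbf{K}'\mathbf{K}\mathbf{Z}^* = \big(\mathbf{C}^*\mathbf{K}_1' \;\; \mathbf{C}^*\mathbf{K}_2'\big)\begin{pmatrix}\mathbf{Z}_I \\ \mathbf{Z}_r\end{pmatrix} = \big(\mathbf{L}_I' \;\; \mathbf{L}_r'\big)\begin{pmatrix}\mathbf{Z}_I \\ \mathbf{Z}_r\end{pmatrix} = \mathbf{L}_I'\mathbf{Z}_I + \mathbf{L}_r'\mathbf{Z}_r,$$
where $\mathbf{L}_I' = \mathbf{C}^*\mathbf{K}_1'$ and $\mathbf{L}_r' = \mathbf{C}^*\mathbf{K}_2'$. Since $\mathbf{C}^* = \mathbf{I}_N \otimes \mathbf{1}_N' \otimes \frac{1}{M}\mathbf{1}_M'$, with $\mathbf{K}_1$ and $\mathbf{K}_2$ as defined in section 4.1, these expressions simplify to
$$\mathbf{L}_I' = \frac{1}{M}\,\mathbf{I}_N \otimes \mathbf{1}_{Im}' \quad\text{and}\quad \mathbf{L}_r' = \frac{1}{M}\big(\mathbf{I}_N \otimes \mathbf{1}_{I(M-m)}' \;\;\; \mathbf{I}_N \otimes \mathbf{1}_{(N-I)m}' \;\;\; \mathbf{I}_N \otimes \mathbf{1}_{(N-I)(M-m)}'\big).$$

5 Estimation

Our goal is to estimate $\boldsymbol{\beta}$ based on the realized sample. Since $\boldsymbol{\beta} = \mathbf{L}_I'\mathbf{Z}_I + \mathbf{L}_r'\mathbf{Z}_r$ and $\mathbf{Z}_I$ is realized, the target of estimation is $\mathbf{L}_r'\mathbf{Z}_r$. The values of $\mathbf{Z}_r$ are not observed, and hence the problem is commonly described as prediction of $\mathbf{L}_r'\mathbf{Z}_r$. We use this terminology.

5.1 Properties of Predictors

We consider predictors of $\mathbf{L}_r'\mathbf{Z}_r$ that are: linear in the data (of the form $\hat{\mathbf{L}}_r'\mathbf{Z}_I$) and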
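unbiased (such that $E_{\xi_1\xi_2}\big(\hat{\mathbf{L}}_r'\mathbf{Z}_I - \mathbf{L}_r'\mathbf{Z}_r\big) = \mathbf{0}$). The unbiased constraint can be expressed as a constraint on the predictor $\hat{\mathbf{L}}_r$. Since $\mathbf{Z}_I = (\mathbf{Z}_{I1}' \; \mathbf{Z}_{I2}' \; \cdots \; \mathbf{Z}_{IN}')'$ with $\mathbf{Z}_{Is}$ of dimension $Im \times 1$, we partition the matrix $\hat{\mathbf{L}}_r$ conformably as $\hat{\mathbf{L}}_r = \big(\hat{\mathbf{L}}_{[1]}' \; \hat{\mathbf{L}}_{[2]}' \; \cdots \; \hat{\mathbf{L}}_{[N]}'\big)'$, where each $Im \times N$ block $\hat{\mathbf{L}}_{[s]}$ has rows indexed by the selections $i = 1, \ldots, I$ and $j = 1, \ldots, m$.

The unbiased constraint is given by $E_{\xi_1\xi_2}\big(\hat{\mathbf{L}}_r'\mathbf{Z}_I - \mathbf{L}_r'\mathbf{Z}_r\big) = \mathbf{0}$, or equivalently that $\big(\hat{\mathbf{L}}_r' \;\; -\mathbf{L}_r'\big)\,E_{\xi_1\xi_2}\begin{pmatrix}\mathbf{Z}_I \\ \mathbf{Z}_r\end{pmatrix} = \mathbf{0}$. Since $\begin{pmatrix}\mathbf{Z}_I \\ \mathbf{Z}_r\end{pmatrix} = \mathbf{K}\mathbf{Z}^*$, we can express the constraint as $\big(\hat{\mathbf{L}}_r' \;\; -\mathbf{L}_r'\big)\,\mathbf{K}\,E_{\xi_1\xi_2}(\mathbf{Z}^*) = \mathbf{0}$. Note that $E_{\xi_1\xi_2}(\mathbf{Z}^*) = \mathbf{X}^*\boldsymbol{\beta}$, so that the unbiased constraint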
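is given by $\big(\hat{\mathbf{L}}_r' \;\; -\mathbf{L}_r'\big)\,\mathbf{K}\mathbf{X}^*\boldsymbol{\beta} = \mathbf{0}$. Now
$$\mathbf{K}\mathbf{X}^*\boldsymbol{\beta} = \begin{pmatrix}\mathbf{X}_I\boldsymbol{\beta} \\ \mathbf{X}_r\boldsymbol{\beta}\end{pmatrix} = \frac{1}{N}\begin{pmatrix}\boldsymbol{\beta} \otimes \mathbf{1}_{Im} \\ \boldsymbol{\beta} \otimes \mathbf{1}_{I(M-m)} \\ \boldsymbol{\beta} \otimes \mathbf{1}_{(N-I)m} \\ \boldsymbol{\beta} \otimes \mathbf{1}_{(N-I)(M-m)}\end{pmatrix}.$$
Then the unbiased constraint is given by $\hat{\mathbf{L}}_r'\mathbf{X}_I\boldsymbol{\beta} - \mathbf{L}_r'\mathbf{X}_r\boldsymbol{\beta} = \mathbf{0}$. Now, since $\mathbf{L}_r' = \frac{1}{M}\big(\mathbf{I}_N \otimes \mathbf{1}_{I(M-m)}' \;\;\; \mathbf{I}_N \otimes \mathbf{1}_{(N-I)m}' \;\;\; \mathbf{I}_N \otimes \mathbf{1}_{(N-I)(M-m)}'\big)$,
$$\mathbf{L}_r'\mathbf{X}_r\boldsymbol{\beta} = \frac{1}{NM}\big[I(M-m) + (N-I)m + (N-I)(M-m)\big]\boldsymbol{\beta}.$$
As a result, $\mathbf{L}_r'\mathbf{X}_r\boldsymbol{\beta} = \frac{NM - Im}{NM}\,\boldsymbol{\beta}$. Thus, the unbiased constraint is given by
$$\frac{1}{N}\,\hat{\mathbf{L}}_r'(\mathbf{I}_N \otimes \mathbf{1}_{Im})\,\boldsymbol{\beta} - \frac{NM - Im}{NM}\,\boldsymbol{\beta} = \mathbf{0}.$$
Now, $(\mathbf{I}_N \otimes \mathbf{1}_{Im})\boldsymbol{\beta} = \boldsymbol{\beta} \otimes \mathbf{1}_{Im}$, where $\boldsymbol{\beta} = (\beta_1 \; \beta_2 \; \cdots \; \beta_N)'$. Then since $\hat{\mathbf{L}}_r = \big(\hat{\mathbf{L}}_{[1]}' \; \hat{\mathbf{L}}_{[2]}' \; \cdots \; \hat{\mathbf{L}}_{[N]}'\big)'$,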
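$$\hat{\mathbf{L}}_r'(\boldsymbol{\beta} \otimes \mathbf{1}_{Im}) = \sum_{s=1}^{N}\beta_s\,\hat{\mathbf{L}}_{[s]}'\mathbf{1}_{Im} = \big(\hat{\mathbf{L}}_{[1]}'\mathbf{1}_{Im} \;\; \hat{\mathbf{L}}_{[2]}'\mathbf{1}_{Im} \;\cdots\; \hat{\mathbf{L}}_{[N]}'\mathbf{1}_{Im}\big)\boldsymbol{\beta}.$$
Then, the unbiased constraint can be expressed as
$$\Big[\frac{1}{N}\big(\hat{\mathbf{L}}_{[1]}'\mathbf{1}_{Im} \;\cdots\; \hat{\mathbf{L}}_{[N]}'\mathbf{1}_{Im}\big) - \frac{NM - Im}{NM}\,\mathbf{I}_N\Big]\boldsymbol{\beta} = \mathbf{0}.$$
For this constraint to hold for all possible values of $\boldsymbol{\beta}$, we require
$$\big(\hat{\mathbf{L}}_{[1]}'\mathbf{1}_{Im} \;\cdots\; \hat{\mathbf{L}}_{[N]}'\mathbf{1}_{Im}\big) = c\,\mathbf{I}_N, \qquad c = \frac{NM - Im}{M}.$$
Row $s$ of the matrix $(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{L}}_r$ is $\mathbf{1}_{Im}'\hat{\mathbf{L}}_{[s]}$. Then the constraint can be expressed as $\hat{\mathbf{L}}_r'(\mathbf{I}_N \otimes \mathbf{1}_{Im}) = c\,\mathbf{I}_N$, or $(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{L}}_r = c\,\mathbf{I}_N$. We take the vec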
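expansion, resulting in the expression $\mathrm{vec}\big[(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{L}}_r\big] = c\,\mathrm{vec}(\mathbf{I}_N)$, which is given equivalently as $\big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}')\big]\mathrm{vec}(\hat{\mathbf{L}}_r) = c\,\mathrm{vec}(\mathbf{I}_N)$, with $c = \frac{NM - Im}{M}$.

3.5 Optimization Criteria

We consider the best estimator to be one that satisfies the constraint that $\big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}')\big]\mathrm{vec}(\hat{\mathbf{L}}_r) = c\,\mathrm{vec}(\mathbf{I}_N)$, and minimizes the generalized mean squared error (see Bolfarine and Zacks (1992), p7) given by
$$GMSE = E_{\xi_1\xi_2}\big[(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta})'(\hat{\boldsymbol{\beta}} - \boldsymbol{\beta})\big].$$
The resulting estimator we denote by $\hat{\mathbf{L}}_r'\mathbf{Z}_I$. The variance of the prediction error is
$$\mathrm{var}_{\xi_1\xi_2}\big(\hat{\mathbf{L}}_r'\mathbf{Z}_I - \mathbf{L}_r'\mathbf{Z}_r\big) = \hat{\mathbf{L}}_r'\mathbf{V}_I\hat{\mathbf{L}}_r - \hat{\mathbf{L}}_r'\mathbf{V}_{I,r}\mathbf{L}_r - \mathbf{L}_r'\mathbf{V}_{r,I}\hat{\mathbf{L}}_r + \mathbf{L}_r'\mathbf{V}_r\mathbf{L}_r.$$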
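3.6 Constructing the Estimating Equations

We construct the estimating equations by expanding the expression for the GMSE in terms of the columns $\hat{\mathbf{l}}_s$ of $\hat{\mathbf{L}}_r$ and the columns $\mathbf{l}_{r,s}$ of $\mathbf{L}_r$, such that
$$GMSE = \sum_{s=1}^{N}\Big(\hat{\mathbf{l}}_s'\mathbf{V}_I\hat{\mathbf{l}}_s - \hat{\mathbf{l}}_s'\mathbf{V}_{I,r}\mathbf{l}_{r,s} - \mathbf{l}_{r,s}'\mathbf{V}_{r,I}\hat{\mathbf{l}}_s + \mathbf{l}_{r,s}'\mathbf{V}_r\mathbf{l}_{r,s}\Big),$$
and then adding on the Lagrangian multipliers to retain the unbiased constraint. First, let us define an $N \times N$ matrix of Lagrangian multipliers $\boldsymbol{\lambda} = (\boldsymbol{\lambda}_1 \; \boldsymbol{\lambda}_2 \; \cdots \; \boldsymbol{\lambda}_N)$, where $\boldsymbol{\lambda}_s = (\lambda_{1s} \; \lambda_{2s} \; \cdots \; \lambda_{Ns})'$ is an $N \times 1$ vector of constants. The unbiased constraint can be summarized as
$$\mathrm{vec}(\boldsymbol{\lambda})'\Big\{\big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}')\big]\mathrm{vec}(\hat{\mathbf{L}}_r) - c\,\mathrm{vec}(\mathbf{I}_N)\Big\}.$$
Then we can express the Lagrangian that is to be minimized as
$$\Phi(\hat{\mathbf{L}}_r, \boldsymbol{\lambda}) = \sum_{s=1}^{N}\Big(\hat{\mathbf{l}}_s'\mathbf{V}_I\hat{\mathbf{l}}_s - \hat{\mathbf{l}}_s'\mathbf{V}_{I,r}\mathbf{l}_{r,s} - \mathbf{l}_{r,s}'\mathbf{V}_{r,I}\hat{\mathbf{l}}_s + \mathbf{l}_{r,s}'\mathbf{V}_r\mathbf{l}_{r,s}\Big) + 2\,\mathrm{vec}(\boldsymbol{\lambda})'\Big\{\big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}')\big]\mathrm{vec}(\hat{\mathbf{L}}_r) - c\,\mathrm{vec}(\mathbf{I}_N)\Big\}.$$
We differentiate this equation with respect to $\hat{\mathbf{l}}_{s^*}$ and $\boldsymbol{\lambda}_{s^*}$ for $s^* = 1, \ldots, N$, and then set the resulting derivatives to zero. We make use of standard matrix differentiation results (see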
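Harville, p95) such that $\frac{\partial(\mathbf{x}'\mathbf{A}\mathbf{x})}{\partial\mathbf{x}} = (\mathbf{A} + \mathbf{A}')\mathbf{x}$, $\frac{\partial(\mathbf{A}\mathbf{x})}{\partial\mathbf{x}'} = \mathbf{A}$ and $\frac{\partial(\mathbf{x}'\mathbf{A})}{\partial\mathbf{x}} = \mathbf{A}$. The derivative with respect to $\hat{\mathbf{l}}_{s^*}$ for $s^* = 1, \ldots, N$ will result in the equation
$$\frac{\partial\Phi}{\partial\hat{\mathbf{l}}_{s^*}} = \big(\mathbf{V}_I\hat{\mathbf{l}}_{s^*} - \mathbf{V}_{I,r}\mathbf{l}_{r,s^*}\big) + \big(\mathbf{V}_I\hat{\mathbf{l}}_{s^*} - \mathbf{V}_{I,r}\mathbf{l}_{r,s^*}\big) + 2(\mathbf{I}_N \otimes \mathbf{1}_{Im})\boldsymbol{\lambda}_{s^*},$$
or
$$\frac{\partial\Phi}{\partial\hat{\mathbf{l}}_{s^*}} = 2\big(\mathbf{V}_I\hat{\mathbf{l}}_{s^*} - \mathbf{V}_{I,r}\mathbf{l}_{r,s^*}\big) + 2(\mathbf{I}_N \otimes \mathbf{1}_{Im})\boldsymbol{\lambda}_{s^*}.$$
This results since
$$\mathrm{vec}(\boldsymbol{\lambda})'\Big\{\big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}')\big]\mathrm{vec}(\hat{\mathbf{L}}_r) - c\,\mathrm{vec}(\mathbf{I}_N)\Big\} = \sum_{s=1}^{N}\boldsymbol{\lambda}_s'\big[(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{l}}_s - c\,\mathbf{e}_s\big],$$
with $\mathbf{e}_s$ the $s$-th column of $\mathbf{I}_N$, whose derivative with respect to $\hat{\mathbf{l}}_{s^*}$ simplifies to $(\mathbf{I}_N \otimes \mathbf{1}_{Im})\boldsymbol{\lambda}_{s^*}$. To form the derivative with respect to $\boldsymbol{\lambda}_{s^*}$ for $s^* = 1, \ldots, N$, note that
$$\mathrm{vec}(\boldsymbol{\lambda})'\Big\{\big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}')\big]\mathrm{vec}(\hat{\mathbf{L}}_r) - c\,\mathrm{vec}(\mathbf{I}_N)\Big\} = \mathrm{vec}(\boldsymbol{\lambda})'\big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}')\big]\mathrm{vec}(\hat{\mathbf{L}}_r) - c\,\mathrm{vec}(\boldsymbol{\lambda})'\mathrm{vec}(\mathbf{I}_N).$$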
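The first term in this expression simplifies to $\mathrm{vec}(\boldsymbol{\lambda})'\,\mathrm{vec}\big[(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{L}}_r\big]$, which can be expressed as $\sum_{s=1}^{N}\boldsymbol{\lambda}_s'(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{l}}_s$. The second term in this expression simplifies to $c\,\mathrm{vec}(\boldsymbol{\lambda})'\mathrm{vec}(\mathbf{I}_N) = c\sum_{s=1}^{N}\lambda_{ss}$. As a result,
$$\mathrm{vec}(\boldsymbol{\lambda})'\Big\{\big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}')\big]\mathrm{vec}(\hat{\mathbf{L}}_r) - c\,\mathrm{vec}(\mathbf{I}_N)\Big\} = \sum_{s=1}^{N}\Big[\boldsymbol{\lambda}_s'(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{l}}_s - c\,\lambda_{ss}\Big].$$
Using this expression, the derivative with respect to the element $\lambda_{ks^*}$ is $\mathbf{1}_{Im}'\hat{\mathbf{l}}_{[k],s^*} - c$ when $k = s^*$ and $\mathbf{1}_{Im}'\hat{\mathbf{l}}_{[k],s^*}$ when $k \neq s^*$, where $\hat{\mathbf{l}}_{[k],s^*}$ denotes the $Im \times 1$ segment of column $s^*$ of $\hat{\mathbf{L}}_r$ that lies in row block $k$. We develop more compact forms of the derivative equations. First, note that the derivative of the constraint term with respect to $\boldsymbol{\lambda}_{s^*}$ is $(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{l}}_{s^*} - c\,\mathbf{e}_{s^*}$, where $\mathbf{e}_{s^*}$ is an $N \times 1$ column vector corresponding to the $s^*$-th column of $\mathbf{I}_N$. Each of the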
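terms $\boldsymbol{\lambda}_s'(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{l}}_s$ is a scalar, so that its derivative with respect to $\boldsymbol{\lambda}_s$ is $(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{l}}_s$. As a result, $\frac{\partial\Phi}{\partial\boldsymbol{\lambda}_{s^*}} = 2\big[(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{l}}_{s^*} - c\,\mathbf{e}_{s^*}\big]$, where $\mathbf{e}_{s^*}$ is the $s^*$-th column of $\mathbf{I}_N$ and $c = \frac{NM - Im}{M}$.

Combining these expressions, the derivatives of the Lagrangian result in the following expressions:
$$\frac{\partial\Phi}{\partial\hat{\mathbf{l}}_{s^*}} = 2\big(\mathbf{V}_I\hat{\mathbf{l}}_{s^*} - \mathbf{V}_{I,r}\mathbf{l}_{r,s^*}\big) + 2(\mathbf{I}_N \otimes \mathbf{1}_{Im})\boldsymbol{\lambda}_{s^*} \quad\text{for } s^* = 1, \ldots, N,$$
and
$$\frac{\partial\Phi}{\partial\boldsymbol{\lambda}_{s^*}} = 2\big[(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{l}}_{s^*} - c\,\mathbf{e}_{s^*}\big] \quad\text{for } s^* = 1, \ldots, N.$$
Setting these equations equal to zero, the best linear unbiased predictor is the solution to the equations
$$\big(\mathbf{V}_I\hat{\mathbf{l}}_{s^*} - \mathbf{V}_{I,r}\mathbf{l}_{r,s^*}\big) + (\mathbf{I}_N \otimes \mathbf{1}_{Im})\boldsymbol{\lambda}_{s^*} = \mathbf{0} \quad\text{for } s^* = 1, \ldots, N,$$
and
$$(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{l}}_{s^*} = c\,\mathbf{e}_{s^*} \quad\text{for } s^* = 1, \ldots, N.$$
We use a vec expansion to express these equations as a single system of equations. First, note that $\mathbf{V}_I\hat{\mathbf{L}}_r = \big(\mathbf{V}_I\hat{\mathbf{l}}_1 \;\; \mathbf{V}_I\hat{\mathbf{l}}_2 \;\cdots\; \mathbf{V}_I\hat{\mathbf{l}}_N\big)$, and similarly, $\mathbf{V}_{I,r}\mathbf{L}_r = \big(\mathbf{V}_{I,r}\mathbf{l}_{r,1} \;\cdots\; \mathbf{V}_{I,r}\mathbf{l}_{r,N}\big)$. Now $\mathrm{vec}(\mathbf{V}_I\hat{\mathbf{L}}_r) = (\mathbf{I}_N \otimes \mathbf{V}_I)\,\mathrm{vec}(\hat{\mathbf{L}}_r)$. The first set of equations can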
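thus be expressed as $\mathbf{V}_I\hat{\mathbf{l}}_{s^*} + (\mathbf{I}_N \otimes \mathbf{1}_{Im})\boldsymbol{\lambda}_{s^*} = \mathbf{V}_{I,r}\mathbf{l}_{r,s^*}$ for $s^* = 1, \ldots, N$. Expressing these equations simultaneously for all $s^* = 1, \ldots, N$,
$$\mathrm{vec}(\mathbf{V}_I\hat{\mathbf{L}}_r) + \mathrm{vec}\big[(\mathbf{I}_N \otimes \mathbf{1}_{Im})\boldsymbol{\lambda}\big] = \mathrm{vec}(\mathbf{V}_{I,r}\mathbf{L}_r).$$
We express the second term in this equation in a different manner, noting that $\mathrm{vec}\big[(\mathbf{I}_N \otimes \mathbf{1}_{Im})\boldsymbol{\lambda}\big] = \big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im})\big]\mathrm{vec}(\boldsymbol{\lambda})$. Then the first set of equations expressed simultaneously for all $s^* = 1, \ldots, N$ is given by
$$(\mathbf{I}_N \otimes \mathbf{V}_I)\,\mathrm{vec}(\hat{\mathbf{L}}_r) + \big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im})\big]\mathrm{vec}(\boldsymbol{\lambda}) = (\mathbf{I}_N \otimes \mathbf{V}_{I,r})\,\mathrm{vec}(\mathbf{L}_r).$$
In a similar manner, we re-express the second set of equations. These equations are given by $(\mathbf{I}_N \otimes \mathbf{1}_{Im}')\hat{\mathbf{L}}_r = c\,\mathbf{I}_N$, or $\big[\mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}')\big]\mathrm{vec}(\hat{\mathbf{L}}_r) = c\,\mathrm{vec}(\mathbf{I}_N)$, with $c = \frac{NM - Im}{M}$.

We summarize these estimating equations by expressing them as one set of equations. The resulting set of equations is given by
$$\begin{pmatrix}\mathbf{I}_N \otimes \mathbf{V}_I & \mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}) \\ \mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im}') & \mathbf{0}\end{pmatrix}\begin{pmatrix}\mathrm{vec}(\hat{\mathbf{L}}_r) \\ \mathrm{vec}(\boldsymbol{\lambda})\end{pmatrix} = \begin{pmatrix}(\mathbf{I}_N \otimes \mathbf{V}_{I,r})\,\mathrm{vec}(\mathbf{L}_r) \\ c\,\mathrm{vec}(\mathbf{I}_N)\end{pmatrix}.$$

3.7 Simplification of the Estimating Equations

3.7.1 General Solution to the Estimating Equations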
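A general solution to a set of estimating equations of the form
$$\begin{pmatrix}\mathbf{T} & \mathbf{U} \\ \mathbf{U}' & \mathbf{0}\end{pmatrix}\begin{pmatrix}\boldsymbol{\theta} \\ \boldsymbol{\lambda}\end{pmatrix} = \begin{pmatrix}\mathbf{A} \\ \mathbf{B}\end{pmatrix}$$
was developed in c00ed7doc (p-). The focus is on estimation of an expression equivalent to $\boldsymbol{\theta}$, where here $\boldsymbol{\theta} = \mathrm{vec}(\hat{\mathbf{L}}_r)$. As described in c00ed7doc, the solution can be expressed in terms of $\mathrm{vec}(\hat{\mathbf{L}}_r)$. Thus,
$$\mathrm{vec}(\hat{\mathbf{L}}_r) = \mathbf{W}\mathbf{B} + (\mathbf{I} - \mathbf{W}\mathbf{U}')\,\mathbf{T}^{-1}\mathbf{A}, \quad\text{where}\quad \mathbf{W} = \mathbf{T}^{-1}\mathbf{U}\,(\mathbf{U}'\mathbf{T}^{-1}\mathbf{U})^{-1}.$$

3.7.2 Explicit Expressions for the Solution

We present expressions for terms needed in the estimating equations to form a solution. For this problem, equivalent notation is given by $\mathbf{T} = \mathbf{I}_N \otimes \mathbf{V}_I$, $\mathbf{U} = \mathbf{I}_N \otimes (\mathbf{I}_N \otimes \mathbf{1}_{Im})$, $\mathbf{A} = (\mathbf{I}_N \otimes \mathbf{V}_{I,r})\,\mathrm{vec}(\mathbf{L}_r)$ and $\mathbf{B} = c\,\mathrm{vec}(\mathbf{I}_N)$, with $c = \frac{NM - Im}{M}$. Also, from section 4.1,
$$\mathbf{V}_I = \Big(\bigoplus_{s=1}^{N}\sigma_s^2\Big) \otimes \frac{1}{N}\mathbf{I}_I \otimes \mathbf{v}_{\xi_2}^* + \frac{1}{N-1}\Big(\mathbf{D}_\mu^2 - \frac{1}{N}\boldsymbol{\mu}\boldsymbol{\mu}'\Big) \otimes \big(\mathbf{v}_{\xi_1}^* \otimes \mathbf{J}_m\big),$$
where $\mathbf{v}_{\xi_2}^* = \mathbf{I}_m - \frac{1}{M}\mathbf{J}_m$ and $\mathbf{v}_{\xi_1}^* = \mathbf{I}_I - \frac{1}{N}\mathbf{J}_I$.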
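The general solution $\mathbf{W}\mathbf{B} + (\mathbf{I} - \mathbf{W}\mathbf{U}')\mathbf{T}^{-1}\mathbf{A}$ with $\mathbf{W} = \mathbf{T}^{-1}\mathbf{U}(\mathbf{U}'\mathbf{T}^{-1}\mathbf{U})^{-1}$ can be tried on a toy system. The sketch below is only illustrative: the matrices are hypothetical small stand-ins rather than the $\mathbf{T}$, $\mathbf{U}$, $\mathbf{A}$, $\mathbf{B}$ of this problem, the helper names are introduced here, and $\mathbf{T}$ is assumed to be invertible.

```python
from fractions import Fraction


def matmul(a, b):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(x * v for x, v in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def inverse(a):
    """Gauss-Jordan inverse using exact rational arithmetic."""
    n = len(a)
    aug = [[Fraction(v) for v in row] + [Fraction(int(r == c)) for c in range(n)]
           for r, row in enumerate(a)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def constrained_solution(T, U, A, B):
    """Return theta = W B + (I - W U') T^{-1} A, W = T^{-1} U (U' T^{-1} U)^{-1}."""
    t_inv = inverse(T)
    u_t = transpose(U)
    w = matmul(matmul(t_inv, U), inverse(matmul(matmul(u_t, t_inv), U)))
    t_inv_a = matmul(t_inv, A)
    wb = matmul(w, B)
    correction = matmul(w, matmul(u_t, t_inv_a))
    return [[p[0] + q[0] - r[0]] for p, q, r in zip(wb, t_inv_a, correction)]


# Hypothetical toy system: T is 3 x 3, U is 3 x 1, A is 3 x 1, B is 1 x 1.
T = [[2, 1, 0], [1, 2, 0], [0, 0, 1]]
U = [[1], [1], [1]]
A = [[1], [0], [2]]
B = [[3]]

theta = constrained_solution(T, U, A, B)
print(theta)
print(matmul(transpose(U), theta) == B)   # the constraint U' theta = B holds
```

Eliminating $\boldsymbol{\lambda}$ from the two block equations gives $\boldsymbol{\lambda} = (\mathbf{U}'\mathbf{T}^{-1}\mathbf{U})^{-1}(\mathbf{U}'\mathbf{T}^{-1}\mathbf{A} - \mathbf{B})$, and substituting back yields the expression implemented above.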