Relating Inference and Missing Data by Rubin (1976) to Simple Random Sampling with Response Error
Ed Stanek


Inference and Missing Data related to SRS with Response Error
07ed58doc /9/007 :54 PM

1. INTRODUCTION

While developing a BLUP of the latent value of a realized subject selected via simple random sampling, in the presence of response error that varies by subject, we discovered that the BLUP is based on the average response error. However, a simple example illustrates that a smaller MSE can be obtained by letting the response error depend on the realized subject in the shrinkage constant. Thus, a predictor that uses the subject specific response error outperforms the BLUP based on average response error. The predictor using subject specific response error is given in standard mixed model solutions to the problem, and makes use of some additional information that is conditional on the subject, namely the subject specific response error. At the same time, there are unconditional aspects to the prediction. These include the definitions of best and unbiased, along with the interest in inference about parameters defined over the entire population rather than the observations on the sampled subjects. The mixture of conditional and unconditional aspects of the problem, along with the counter-example illustrating that the finite population BLUP is not optimal, provides motivation for further study.

There is a broad literature on inference in the presence of missing data involving conditional and unconditional concepts. Little and Rubin (2002, 2nd edition) refer to an earlier paper by Rubin (1976) in reference to inference with missing data in a finite population sampling context. The second edition omits two sections from the first edition, Little and Rubin (1986), that discuss randomization inference with and without missing data, but includes some discussion of survey non-response (see the section entitled Weighted complete-case analysis in Little and Rubin (2002, 2nd edition)). This change shifts the focus more towards Bayesian and likelihood methods.

There is a close connection between finite population mixed model approaches using indicator random variables for sampled subjects, and missing data methods that use indicator random variables for missing data. While similar, these methods have not been formally connected. Differences occur in notation, and the juxtaposition of ideas is not clearly defined. We address these issues in this paper, with the ultimate purpose of better understanding the mixed model prediction problem with response error. We first review the context and notation for the finite population mixed model with response error. We subsequently review the setting and notation developed by Little and Rubin, using Little and Rubin (2002) as a primary source.

2. FINITE POPULATION MIXED MODELS WITH RESPONSE ERROR

The Model for Subjects with Response Error in the Population

We consider the setting where a non-stochastic response is represented by $y_s$ for $s = 1, \ldots, N$. We define the average response as $\mu = \frac{1}{N}\sum_{s=1}^{N} y_s$ and the subject deviation as $\beta_s = y_s - \mu$, so that $y_s = \mu + \beta_s$. We assume that an observation is made with response error, and represent the $k$th response for subject $s$ as

$$Y_{sk} = y_s + W_{sk} = \mu + \beta_s + W_{sk},$$

where $E_R(W_{sk}) = 0$ and $\mathrm{var}_R(W_{sk}) = \sigma_s^2$ for $s = 1, \ldots, N$, and the subscript $R$ denotes expectation with respect to response error. When there are $N$ subjects in the population, and there is a single response on each subject, we represent the vector of responses as

$$\mathbf{Y} = \begin{pmatrix} Y_{1k} \\ Y_{2k} \\ \vdots \\ Y_{Nk} \end{pmatrix} = \begin{pmatrix} y_1 + W_{1k} \\ y_2 + W_{2k} \\ \vdots \\ y_N + W_{Nk} \end{pmatrix}. \qquad (1)$$

Each row in (1) corresponds to a subject. Notice that this representation does not include any notion of sampling or missing data. We assume that response error for subject $s$ is independent of response error for any other subject, and independent for different measures of response. However, we limit discussion to settings where $k = 1$ in subsequent discussion.

We define some additional notation to simplify expressions for a realization of $\mathbf{Y}$. Let us index the possible realizations of $Y_{sk}$ by $r = 1, \ldots, L_s$. For simplicity, we assume that $L_s = L$ for all $s = 1, \ldots, N$. Combining these ideas for each subject, there are $\prod_{s=1}^{N} L_s = L^N$ possible realizations of $\mathbf{Y}$.

Random Variables that Identify the Sample

We assume that a simple random sample of $n$ subjects (without replacement) is selected from the population, and response is observed for the selected sample. Usually, a sample is represented as a proper set of subjects. Practically, we can define a sample as the first $n$ subjects in a permutation of subjects. Using this definition, the set of possible permutations defines the set of possible samples, where each sample is defined as a sequence of subjects. Each element in a realized sample corresponds to a subject in a position in the permutation. Representing a sample by a sequence enables subjects to be formally linked to random variables representing sample elements, and to models for these random variables, including mixed models.

Notice that this way of defining a sample is different from defining a sample as a set of subjects, since the position of the subject is not included in samples defined as sets. We note that defining possible samples in terms of permutations will include multiples of a sample defined by a sequence of subjects, since there are $(N - n)!$ possible sequences of the remaining subjects. This multiplicity is not a complicating factor since it occurs equally for all sequences of subjects. Defining possible samples as a sequence will also include multiples of a sample defined as a set, since each permutation of the set is considered to be different. The number of different sample sequences from permutations corresponding to the same set of subjects is $n!$. We have not accounted for the fact that the same subject occurs in different positions in sample sequences in developing predictors.

We represent the random variables that describe a permutation as indicator random variables, $U_{is}$, where $i = 1, \ldots, N$ indexes the position of the variable in the permutation, and $s = 1, \ldots, N$ indexes the subject. The value of $U_{is}$ is one when subject $s$ is selected in position $i$, and zero otherwise. We represent possible permutations via a matrix of random variables, where for example, when $N = 3$,

$$\mathbf{U} = \begin{pmatrix} U_{11} & U_{12} & U_{13} \\ U_{21} & U_{22} & U_{23} \\ U_{31} & U_{32} & U_{33} \end{pmatrix}.$$

For each random variable in $\mathbf{U}$ there is a possible response that corresponds to the value that would occur if a subject was in a particular position in the permutation. For this reason, the $U_{is}$ are missing data indicator random variables. There are $h = 1, \ldots, N!$ possible permutations, where (when $N = 3$) the realizations of the permutation represented by the random variables $\mathbf{U}$ are the six $3 \times 3$ permutation matrices $\mathbf{u}_1, \mathbf{u}_2, \mathbf{u}_3, \mathbf{u}_4, \mathbf{u}_5$ and $\mathbf{u}_6$.

We define the sample associated with each $\mathbf{u}_h$ using the first $n$ rows of $\mathbf{u}_h$. For example, the response associated with a realization $\mathbf{u}_h$ is given by a matrix with rows for Position 1, Position 2 and Position 3 and columns for the subjects, in which the entry for position $i$ and subject $s$ is $y_s + W_{sk}$ when subject $s$ is in position $i$, and 0 otherwise. The values in this matrix that are 0 indicate that the $y_s + W_{sk}$ response is missing. The set of responses for the possible permutations are given in Table 1.
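The indicator representation above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 0-based subject labels, and the response values in `Y` are hypothetical, `None` is used in place of 0 to mark a missing response, and the order in which the permutations are enumerated is that of `itertools.permutations`, which need not match the order of $\mathbf{u}_1, \ldots, \mathbf{u}_6$ in the text.

```python
from itertools import permutations


def permutation_matrices(N):
    """All N! realizations of U; u[i][s] == 1 when subject s is in position i."""
    return [
        [[1 if perm[i] == s else 0 for s in range(N)] for i in range(N)]
        for perm in permutations(range(N))
    ]


def permuted_responses(u, responses):
    """Position-by-subject matrix of responses; None marks a missing response."""
    N = len(u)
    return [
        [responses[s] if u[i][s] == 1 else None for s in range(N)]
        for i in range(N)
    ]


def sample_subjects(u, n):
    """0-based labels of the subjects in the first n positions (the sample)."""
    return [row.index(1) for row in u[:n]]


# Hypothetical realized responses y_s + W_sk for N = 3 subjects.
Y = [10.0, 20.0, 30.0]
realizations = permutation_matrices(3)
for h, u in enumerate(realizations, start=1):
    print(h, sample_subjects(u, n=2), permuted_responses(u, Y))
```

Each realization has exactly one 1 in every row and every column, so each permuted response matrix has $N$ non-missing entries and $N(N-1)$ missing entries.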

Table 1. Possible Permutations and Responses when $N = 3$
Columns: Permutation ($h = 1, \ldots, 6$); Realization of Possible Random Variables ($\mathbf{U} = \mathbf{u}_h$); Permuted Response Associated with $\mathbf{U} = \mathbf{u}_h$ (entry $y_s + W_{sk}$ where subject $s$ occupies the position, and 0 otherwise).

Notice that each realization of $\mathbf{U}$ results in a limited set of observed random variables, and a relatively large set of missing values (corresponding to $N(N-1)$ random variables). We consider these broader sets of random variables to develop ideas of inference.

Permutations and Missing Data when there is no Response Error

We discuss the representation of the sampling problem via permutations and the indicator random variables in some more detail when there is no response error, limiting the discussion to the setting where $N = 3$. Recall that the permutations are indexed by $h = 1, \ldots, N!$. Let us define an indicator random variable, $I_h$, that has a value of 1 if permutation $h$ is selected, and zero otherwise. The set of random variables $I_h$, for $h = 1, \ldots, N!$, are multinomial, where $P(I_h = 1) = \frac{1}{N!}$ when each permutation is equally likely (which we assume). When we select a permutation, we realize $\mathbf{I} = (I_1\ I_2\ \cdots\ I_{N!})'$. Only one random variable in $\mathbf{I}$ will have a value of 1, while the others will have values of zero, indicating that the other permutations are not observed.

Associated with a random variable $I_h$ is a vector of values, say $\mathbf{x}_h = (x_{h1}\ x_{h2}\ \cdots\ x_{hN})'$, where the values in this vector are represented by $x_{hi}$ and $i$ indexes the position in permutation $h$. We assume that these values are non-stochastic. We represent the full set of random permutation random variables via the $N! \times N$ matrix with rows $I_h \mathbf{x}_h'$, which we write here as $\mathbf{Z}$. We consider this to be the complete set of random variables. As an example, when $N = 3$, the transpose of this matrix is given by

$$\mathbf{Z}' = \begin{pmatrix} I_1 x_{11} & I_2 x_{21} & I_3 x_{31} & I_4 x_{41} & I_5 x_{51} & I_6 x_{61} \\ I_1 x_{12} & I_2 x_{22} & I_3 x_{32} & I_4 x_{42} & I_5 x_{52} & I_6 x_{62} \\ I_1 x_{13} & I_2 x_{23} & I_3 x_{33} & I_4 x_{43} & I_5 x_{53} & I_6 x_{63} \end{pmatrix}.$$

If there is response error and the values $x_{hi}$ represent a realized response, then it is possible that all the values of $x_{hi}$ may be different. This would occur, for example, if a different realization of response occurred for different permutations.

All of the random variables in $\mathbf{Z}'$ are potentially observable since any $I_h$ can have a value of one. In a realization of $\mathbf{I}$, all of the random variables $I_h$, for $h = 1, \ldots, N!$, are observed, but all but one of their observed values are zero (corresponding to a realized value of $I_h$ of zero). When the realized value of $I_h$ is zero, although $I_h x_{hi} = 0$, it will not be possible to observe the value of $x_{hi}$ directly. Let us refer to this kind of missing data as directly missing.

Sampling, Permutations and Potentially Observable Random Variables

We are interested in settings where a fixed size sample is selected and only the values in the sample are observed. For each $h$, we define the sample as the first $i = 1, \ldots, n$ values in $\mathbf{x}_h$, and partition $\mathbf{x}_h$ into two parts corresponding to the sample and the remainder. The values of the variables in the remainder, $x_{hi}$ for $i > n$, will not be observed, even if the realized value of $I_h$ is one. Let us refer to this type of missing data as intentionally missing. Notice that random variables in $\mathbf{Z}'$ that are intentionally missing may never be observed, even though the value of $I_h$ may be observed.

We now consider the idea of potentially observable in the context of sampling, and discuss what is potentially observable. The variables in $\mathbf{Z}'$ that are intentionally missing may never be observed, and hence are not potentially observable. It seems logical that if a variable is not potentially observable, it should play no role in inference. We do not attempt to prove this intuition, but accept it as valid. This idea may be related to Ericson's (1988) comments on the limitations of the superpopulation framework for inference, since he sees no reason to include intentionally missing variables in a model framework. It also may be related to the idea of ancillary data, since it seems that intentionally missing variables play no role in inference. It may be possible to partition $\mathbf{Z}'$ into two parts, and show using the Rao-Bellhouse theorem that the portion that is intentionally missing does not contribute to inference about any linear combination of potentially observable random variables. All of these areas require more research.

We assume that variables in $\mathbf{Z}'$ that are intentionally missing may never be observed, so that (when $N = 3$ and $n = 2$) the potentially observable random variables are given by

$$\begin{pmatrix} I_1 x_{11} & I_2 x_{21} & I_3 x_{31} & I_4 x_{41} & I_5 x_{51} & I_6 x_{61} \\ I_1 x_{12} & I_2 x_{22} & I_3 x_{32} & I_4 x_{42} & I_5 x_{52} & I_6 x_{62} \end{pmatrix}.$$

Complete Data for Potentially Observable Random Variables

We define the complete data to be the values of the potentially observable random variables when each is observed. We consider a random variable to be observed when the realized value of its indicator is one. The complete data is then given by the values

$$\begin{pmatrix} x_{11} & x_{21} & x_{31} & x_{41} & x_{51} & x_{61} \\ x_{12} & x_{22} & x_{32} & x_{42} & x_{52} & x_{62} \end{pmatrix}.$$

We represent the values for position $i$ by the vector $\mathbf{x}_i$, so that the complete data are collected in $(\mathbf{x}_1\ \mathbf{x}_2)$. Using these expressions, the potentially observable random variables can be written in vectorized form as the vector with elements $I_h x_{hi}$, ordered by permutation $h$ within position $i$ for $i = 1, \ldots, n$.
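The distinction between observed, directly missing, and intentionally missing values can be illustrated with a short Python sketch. The function names and the example values are hypothetical, positions and permutations are 0-based, and the values $x_{hi}$ are generated here without response error purely for convenience.

```python
from itertools import permutations


def potentially_observable(x, n, selected):
    """n rows (positions) by len(x) columns (permutations) of I_h * x_hi.

    x[h][i] is the value in position i of permutation h; `selected` is the
    index of the realized permutation, so I_h = 1 only for h == selected.
    """
    return [
        [x[h][i] if h == selected else 0 for h in range(len(x))]
        for i in range(n)
    ]


def classify_cells(num_perms, N, n, selected):
    """Label every (permutation, position) cell after a permutation is realized."""
    status = {}
    for h in range(num_perms):
        for i in range(N):
            if i >= n:
                status[(h, i)] = "intentionally missing"
            elif h == selected:
                status[(h, i)] = "observed"
            else:
                status[(h, i)] = "directly missing"
    return status


# Hypothetical values: x_hi is the value of the subject in position i of permutation h.
y = [5, 7, 9]
x = [[y[s] for s in perm] for perm in permutations(range(3))]
rows = potentially_observable(x, n=2, selected=2)
status = classify_cells(len(x), N=3, n=2, selected=2)
```

With $N = 3$ and $n = 2$, the 18 cells split into 2 observed, 10 directly missing, and 6 intentionally missing values.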

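Potentially Observable Random Variables and Complete Data with No Response Error

We now assume that there is no response error. In this setting, some of the values of $x_{hi}$ are the same as values of $x_{h^*i^*}$ for $h \ne h^*$ and $i \ne i^*$. Notice that this same situation may occur if the values of $x_{hi}$ represent the realized value for a subject with response error, and the permutation was made after the value was realized. However, if the values are realized after the permutation is made, then the values of $x_{hi}$ are not likely to be the same as those of $x_{h^*i^*}$ for $h \ne h^*$ and $i \ne i^*$.

When there is no response error, we can represent these relationships by defining subject labels, and keeping track of which subject is in which position in a given permutation. We choose the permutation for $h = 1$ to define labels for subjects such that $\mathbf{x}_1 = \mathbf{y}$, where elements are defined such that $x_{1i} = y_s$ where $i = s$. Basically, we use the position labels for $h = 1$ to define the subject labels. With this representation, we can represent the pattern implied by the permutations for the $x_{hi}$ in terms of the values of $y_s$. When $N = 3$ and $n = 2$, the potentially observable random variables are given by the $2 \times 6$ matrix whose entry for position $i$ and permutation $h$ is $I_h y_s$, where $s$ is the subject in position $i$ of permutation $h$; the complete data are the corresponding values $y_s$.

It is possible to express the relationship between $\mathbf{x}_h$ and $\mathbf{y}$. For the sample positions it is given by $\mathbf{x}_h = \mathbf{C}_h \mathbf{y}$, where $\mathbf{C}_h$ is an $n \times N$ matrix of zeros and ones that maps the relationship between the values of $x_{hi}$ and $y_s$. When $N = 3$, $n = 2$ and $h = 1$,

$$\mathbf{C}_1 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \quad \text{so that} \quad \mathbf{C}_1 \mathbf{y} = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$

The matrix that is multiplied by $\mathbf{y}$ is known and non-stochastic. As a result, the potentially observable random variables are given by the products $I_h \mathbf{C}_h \mathbf{y}$, $h = 1, \ldots, 6$. Collecting these products for position $i = 1$ and for position $i = 2$ gives rise to the potentially observable random variables expressed in terms of $I_1, \ldots, I_6$ and $\mathbf{y}$.

Properties of the Random Variables $I_h$

The random variables $I_h$ are multinomial, with each $I_h$ Bernoulli, such that $E(I_h) = \frac{1}{N!}$. As a result, $E(\mathbf{I}) = \frac{1}{N!}\mathbf{1}_{N!}$ and

$$\mathrm{var}(\mathbf{I}) = \frac{1}{N!}\left(\mathbf{I}_{N!} - \frac{1}{N!}\mathbf{J}_{N!}\right),$$

so that $\mathrm{var}(I_h) = \frac{1}{N!}\left(1 - \frac{1}{N!}\right)$ and $\mathrm{cov}(I_h, I_{h^*}) = -\left(\frac{1}{N!}\right)^2$ for $h \ne h^*$.

Patterns in the Potentially Observable Random Variables

There are patterns in the potentially observable random variables. For example, when $N = 3$ and $n = 2$, there are $n!$ permutations with the same sample subjects. This implies that there are $N!/n! = 3$ permutations with unique sets of $n$ subjects. Suppose that we order the subjects in a set from smallest to largest label. Let $g = 1, 2, 3$ index the permutations for the sets, corresponding to the subjects $(1\ 2)$, $(1\ 3)$ and $(2\ 3)$. The random variables representing these unique sets are given by $I^*_g$. The random vector $\mathbf{I}^* = (I^*_1\ I^*_2\ I^*_3)'$ is multinomial, where $E(\mathbf{I}^*) = \frac{1}{G}\mathbf{1}_G$ and $\mathrm{var}(\mathbf{I}^*) = \frac{1}{G}\left(\mathbf{I}_G - \frac{1}{G}\mathbf{J}_G\right)$, where $G = 3$. The potentially observable random variables for the unique sets are

$$\begin{pmatrix} I^*_1 y_1 & I^*_2 y_1 & I^*_3 y_2 \\ I^*_1 y_2 & I^*_2 y_3 & I^*_3 y_3 \end{pmatrix}.$$

Notice that the values in position $i = 1$ are $\mathbf{C}_1\mathbf{y} = (y_1\ y_1\ y_2)'$, while the values in position $i = 2$ are $\mathbf{C}_2\mathbf{y} = (y_2\ y_3\ y_3)'$, where

$$\mathbf{C}_1 = \begin{pmatrix} 1 & 0 & 0 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{pmatrix} \quad \text{and} \quad \mathbf{C}_2 = \begin{pmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 0 & 0 & 1 \end{pmatrix}.$$

In general, we represent the matrix that identifies how subjects correspond to positions in the simplified set of permutations by $\mathbf{C}_i$, with rows indexed by the set $g$ and columns indexed by the subject $s$. Writing $\mathbf{Z}^*$ for the matrix with elements $I^*_g x^*_{gi}$, where $x^*_{gi}$ is the value in position $i$ of set $g$, and $\mathrm{diag}(\mathbf{I}^*)$ for the diagonal matrix with diagonal elements $I^*_g$,

$$\mathrm{vec}(\mathbf{Z}^*) = \begin{pmatrix} \mathrm{diag}(\mathbf{I}^*)\,\mathbf{C}_1 \\ \mathrm{diag}(\mathbf{I}^*)\,\mathbf{C}_2 \end{pmatrix} \mathbf{y}.$$

Linear Combinations of Random Variables

We define quantities of interest in terms of linear combinations of the vector of random variables $\mathrm{vec}(\mathbf{Z}^*)$, given by $P = \mathbf{L}'\mathrm{vec}(\mathbf{Z}^*)$. We discuss two possible linear combinations. The first is a linear combination formed by adding all random variables. The second is a linear combination that corresponds to a subject.

First, suppose we are interested in a linear combination that corresponds to the sum of all random variables. Consider the sum of all random variables, $P = \mathbf{L}'\mathrm{vec}(\mathbf{Z}^*)$, where $\mathbf{L} = \mathbf{1}$. Using the expression for $\mathrm{vec}(\mathbf{Z}^*)$, $P = \sum_{i} \mathbf{I}^{*\prime}\mathbf{C}_i\mathbf{y}$. Now $E(\mathbf{I}^*) = \frac{1}{G}\mathbf{1}_G$. As a result,

$$E(P) = \frac{1}{G}\sum_{i=1}^{n} \mathbf{1}_G'\mathbf{C}_i\,\mathbf{y}.$$

The expression $\mathbf{1}_G'\mathbf{C}_i$ is a constant vector of dimension $1 \times N$. Using $\mathbf{C}_1$ and $\mathbf{C}_2$ above, $\mathbf{1}_G'\mathbf{C}_1 = (2\ 1\ 0)$ and $\mathbf{1}_G'\mathbf{C}_2 = (0\ 1\ 2)$, so that $\sum_i \mathbf{1}_G'\mathbf{C}_i = 2\,\mathbf{1}_N'$. As a result, $E(P) = \frac{2}{3}\mathbf{1}_N'\mathbf{y}$. Since $N = 3$ and $n = 2$, $G = 3$, and $E(P) = 2\mu$. More generally, $\sum_i \mathbf{1}_G'\mathbf{C}_i = \frac{(N-1)!}{(n-1)!\,(N-n)!}\mathbf{1}_N'$. As a result,

$$E(P) = \frac{n!\,(N-n)!}{N!}\cdot\frac{(N-1)!}{(n-1)!\,(N-n)!}\,\mathbf{1}_N'\mathbf{y} = \frac{n}{N}\,N\mu = n\mu.$$

Next, we consider a linear combination of random variables that corresponds to a particular subject. We consider such a linear combination when $N = 3$ and $n = 2$. First, recall that

$$\mathrm{vec}(\mathbf{Z}^*) = \begin{pmatrix} \mathrm{diag}(\mathbf{I}^*)\,\mathbf{C}_1 \\ \mathrm{diag}(\mathbf{I}^*)\,\mathbf{C}_2 \end{pmatrix} \mathbf{y}.$$

If we are interested in the first subject (i.e. the subject labeled $s = 1$), we define a linear combination equal to $\mathbf{L}_1'\mathrm{vec}(\mathbf{Z}^*)$, where $\mathbf{L}_1$ has a value of one for each element of $\mathrm{vec}(\mathbf{Z}^*)$ that corresponds to subject 1, and zero otherwise; that is, $\mathbf{L}_1$ stacks the first columns of $\mathbf{C}_1$ and $\mathbf{C}_2$. If interest is in the second subject, we define $\mathbf{L}_2$ from the second columns, and for the third subject, we define $\mathbf{L}_3$ from the third columns. Since $E(\mathbf{I}^*) = \frac{1}{3}\mathbf{1}_3$ and each subject belongs to two of the three sets,

$$E\left(\mathbf{L}_s'\mathrm{vec}(\mathbf{Z}^*)\right) = \frac{2}{3}\,y_s.$$

Linear Combinations with a Population of Size $N$ and a Sample of Size $n$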
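The expected values for $N = 3$ and $n = 2$ can be checked numerically. The sketch below is a hedged illustration: it assumes the three sets are ordered as $(1\ 2)$, $(1\ 3)$, $(2\ 3)$, uses membership matrices `C1` and `C2` with one row per set and one column per subject, and uses made-up values for `y`. Exact rational arithmetic avoids rounding.

```python
from fractions import Fraction

# Rows are the sets g = (1 2), (1 3), (2 3); columns are subjects 1, 2, 3.
C1 = [[1, 0, 0], [1, 0, 0], [0, 1, 0]]  # subject in position 1 of each set
C2 = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]  # subject in position 2 of each set
G = 3                                    # number of unique sets
y = [Fraction(4), Fraction(7), Fraction(10)]  # hypothetical subject values


def column_sums(C):
    """1'C: how many sets place each subject in this position."""
    return [sum(row[s] for row in C) for s in range(len(C[0]))]


# Sum over positions of 1'C_i: number of sets containing each subject.
membership = [a + b for a, b in zip(column_sums(C1), column_sums(C2))]

mu = sum(y) / len(y)
expected_total = Fraction(1, G) * sum(m * ys for m, ys in zip(membership, y))
expected_subject = [Fraction(1, G) * membership[s] * y[s] for s in range(3)]
```

Here `expected_total` equals $2\mu$ and `expected_subject[s]` equals $\frac{2}{3}y_s$, in agreement with the expressions in the text.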

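Consider a setting where we have a simple random sample without replacement of size $n$ from a population of size $N$. In this context, writing $\mathbf{Z}^*$ for the matrix of potentially observable random variables $I^*_g x^*_{gi}$ (set $g$, position $i$), we define

$$\mathrm{vec}(\mathbf{Z}^*) = \begin{pmatrix} \mathrm{diag}(\mathbf{I}^*)\,\mathbf{C}_1 \\ \vdots \\ \mathrm{diag}(\mathbf{I}^*)\,\mathbf{C}_n \end{pmatrix}\mathbf{y},$$

where the matrices

$$\mathbf{C}_i = \begin{pmatrix} c_{i11} & c_{i12} & \cdots & c_{i1N} \\ \vdots & \vdots & & \vdots \\ c_{iG1} & c_{iG2} & \cdots & c_{iGN} \end{pmatrix}$$

are $G \times N$ matrices that identify the membership of subjects $s = 1, \ldots, N$ to subsets $g = 1, \ldots, G$, where $G = \frac{N!}{n!\,(N-n)!}$.

Suppose we are interested in a linear combination that corresponds to the sum of all random variables, given by $P = \mathbf{L}'\mathrm{vec}(\mathbf{Z}^*)$ where $\mathbf{L} = \mathbf{1}$. Since $E(\mathbf{I}^*) = \frac{1}{G}\mathbf{1}_G$, $E\left(\mathrm{vec}(\mathbf{Z}^*)\right) = \frac{1}{G}\left(\mathbf{C}_1'\ \cdots\ \mathbf{C}_n'\right)'\mathbf{y}$, so that

$$E(P) = \frac{1}{G}\sum_{i=1}^{n}\mathbf{1}_G'\mathbf{C}_i\,\mathbf{y}.$$

In this expression, $G$ represents the number of possible subsets of $n$ subjects in the sample. When there are $N$ subjects in the population, and the sample is of size $n$, the number of possible subsets is $G = \frac{N!}{n!\,(N-n)!}$. The matrices $\mathbf{C}_i$ identify, for the simplified set of possible subsets (corresponding to collapsing the $N!$ possible permutations to $G$ possible values), which subsets contain which subject.

For example, with $N = 4$ and $n = 2$, there are $\frac{4!}{2!\,2!} = 6$ possible different subsets of subjects, corresponding to the subsets $g = 1$ with $(1\ 2)$, $g = 2$ with $(1\ 3)$, $g = 3$ with $(1\ 4)$, $g = 4$ with $(2\ 3)$, $g = 5$ with $(2\ 4)$, and $g = 6$ with $(3\ 4)$. We can think of a sample as the realization of a subset of subjects. Indexing the possible realizations by $g = 1, \ldots, G$, with indicator random variables $I^*_g$ corresponding to $g$, the potentially observable random variables are

$$\begin{pmatrix} I^*_1 y_1 & I^*_2 y_1 & I^*_3 y_1 & I^*_4 y_2 & I^*_5 y_2 & I^*_6 y_3 \\ I^*_1 y_2 & I^*_2 y_3 & I^*_3 y_4 & I^*_4 y_3 & I^*_5 y_4 & I^*_6 y_4 \end{pmatrix}.$$

When $N = 4$ and $n = 2$, $\mathbf{L} = (\mathbf{1}_6'\ \mathbf{1}_6')'$ and

$$\mathbf{C}_1 = \begin{pmatrix} 1&0&0&0 \\ 1&0&0&0 \\ 1&0&0&0 \\ 0&1&0&0 \\ 0&1&0&0 \\ 0&0&1&0 \end{pmatrix} \quad \text{while} \quad \mathbf{C}_2 = \begin{pmatrix} 0&1&0&0 \\ 0&0&1&0 \\ 0&0&0&1 \\ 0&0&1&0 \\ 0&0&0&1 \\ 0&0&0&1 \end{pmatrix}.$$

Hence

$$\mathrm{diag}(\mathbf{I}^*)\mathbf{C}_1 = \begin{pmatrix} I^*_1&0&0&0 \\ I^*_2&0&0&0 \\ I^*_3&0&0&0 \\ 0&I^*_4&0&0 \\ 0&I^*_5&0&0 \\ 0&0&I^*_6&0 \end{pmatrix} \quad \text{while} \quad \mathrm{diag}(\mathbf{I}^*)\mathbf{C}_2 = \begin{pmatrix} 0&I^*_1&0&0 \\ 0&0&I^*_2&0 \\ 0&0&0&I^*_3 \\ 0&0&I^*_4&0 \\ 0&0&0&I^*_5 \\ 0&0&0&I^*_6 \end{pmatrix},$$

where $\mathrm{vec}(\mathbf{Z}^*)$ is obtained by stacking these two matrices and multiplying by $\mathbf{y}$.

When predicting the total of all random variables, $E(P) = \frac{1}{G}\sum_i \mathbf{1}_G'\mathbf{C}_i\,\mathbf{y}$. In this setting, $\mathbf{1}_G'\mathbf{C}_1 = (3\ 2\ 1\ 0)$, while $\mathbf{1}_G'\mathbf{C}_2 = (0\ 1\ 2\ 3)$. As a result, $\sum_i \mathbf{1}_G'\mathbf{C}_i = 3\,\mathbf{1}_4'$, or in general $\frac{(N-1)!}{(n-1)!\,(N-n)!}\mathbf{1}_N'$. As a result,

$$E(P) = \frac{n!\,(N-n)!}{N!}\cdot\frac{(N-1)!}{(n-1)!\,(N-n)!}\,N\mu = n\mu.$$

As a result, when $N = 4$ and $n = 2$, $E(P) = 2\mu$.

Next, we consider a linear combination of random variables that corresponds to a particular subject. If we are interested in subject $s$, we define a linear combination equal to $P = \mathbf{L}_s'\mathrm{vec}(\mathbf{Z}^*)$, where $\mathbf{L}_s$ stacks column $s$ of $\mathbf{C}_1, \ldots, \mathbf{C}_n$. For example, when $N = 4$ and $n = 2$, if we are interested in a linear combination corresponding to subject 1, since $G = 6$, $\mathbf{L}_1 = (1\ 1\ 1\ 0\ 0\ 0\ \ 0\ 0\ 0\ 0\ 0\ 0)'$.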
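The membership matrices for a general $N$ and $n$ can be built mechanically. The sketch below is an illustration only: the function names are hypothetical, subject labels are 0-based, and the subsets are listed in the lexicographic order produced by `itertools.combinations`, which matches the ordering $(1\ 2), (1\ 3), \ldots, (3\ 4)$ used in the example above.

```python
from itertools import combinations
from math import comb


def membership_matrices(N, n):
    """Return the ordered subsets and one G x N 0/1 matrix per sample position.

    Cs[i][g][s] == 1 when subject s is in position i of subset g, with the
    subjects in each subset ordered from smallest to largest label.
    """
    subsets = list(combinations(range(N), n))
    Cs = [
        [[1 if subset[i] == s else 0 for s in range(N)] for subset in subsets]
        for i in range(n)
    ]
    return subsets, Cs


def membership_counts(Cs):
    """Sum over positions of 1'C_i: number of subsets containing each subject."""
    N = len(Cs[0][0])
    return [sum(row[s] for C in Cs for row in C) for s in range(N)]


subsets, Cs = membership_matrices(4, 2)
counts = membership_counts(Cs)  # each subject is in comb(N - 1, n - 1) subsets
```

For $N = 4$ and $n = 2$ every entry of `counts` is 3, the number of subsets that contain a given subject.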

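We can evaluate the expected value of $P = \mathbf{L}_s'\mathrm{vec}(\mathbf{Z}^*)$, where $\mathrm{vec}(\mathbf{Z}^*)$ is the vector of potentially observable random variables for the possible subsets, using $E\left(\mathrm{vec}(\mathbf{Z}^*)\right) = \frac{1}{G}\left(\mathbf{C}_1'\ \cdots\ \mathbf{C}_n'\right)'\mathbf{y}$. Multiplying $\mathbf{L}_s'$ by $E\left(\mathrm{vec}(\mathbf{Z}^*)\right)$ is equivalent to summing the values in column $s$ of the $\mathbf{C}_i$. This sum is equal to the same value for all $s = 1, \ldots, N$, and is equal to the number of different subsets of size $n - 1$ that can be selected from the population, after omitting subject $s$ from the population. This number is equal to $\frac{(N-1)!}{(n-1)!\,(N-n)!}$. As a result,

$$E\left(\mathbf{L}_s'\mathrm{vec}(\mathbf{Z}^*)\right) = \frac{n!\,(N-n)!}{N!}\cdot\frac{(N-1)!}{(n-1)!\,(N-n)!}\,y_s = \frac{n}{N}\,y_s.$$

When $N = 4$ and $n = 2$, this is equal to $\frac{2}{4}\,y_s = \frac{1}{2}\,y_s$.

Prediction in the Context of Permutations

The next area that we discuss is prediction of a linear combination of random variables. The basic plan for prediction is to represent the random variables in two sets, one of which will be realized. Using the joint distribution of the random variables, we form the best linear unbiased predictor following a plan similar to Royall. We consider the setting where $N = 3$ and $n = 2$ to develop the predictor, and then expand to more general settings.

Recall that when $N = 3$ and $n = 2$, the random variables that represent unique sets of subjects are given by $I^*_g$, with potentially observable random variables

$$\begin{pmatrix} I^*_1 y_1 & I^*_2 y_1 & I^*_3 y_2 \\ I^*_1 y_2 & I^*_2 y_3 & I^*_3 y_3 \end{pmatrix}.$$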
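The result $E\left(\mathbf{L}_s'\mathrm{vec}(\cdot)\right) = \frac{n}{N}y_s$ can be cross-checked by brute force over all $N!$ equally likely permutations, without collapsing to subsets. The sketch below is an illustration under stated assumptions: the function names and the values in `y` are hypothetical, and subjects are labeled from 0.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def expected_subject_term(y, n, s):
    """Exact E[ 1(subject s is in the first n positions) * y_s ] over all permutations."""
    N = len(y)
    total = Fraction(0)
    for perm in permutations(range(N)):
        if s in perm[:n]:
            total += y[s]
    return total / factorial(N)


def closed_form_weight(N, n):
    """(n!(N-n)!/N!) * ((N-1)!/((n-1)!(N-n)!)), which simplifies to n/N."""
    one_over_G = Fraction(factorial(n) * factorial(N - n), factorial(N))
    subsets_with_s = Fraction(factorial(N - 1), factorial(n - 1) * factorial(N - n))
    return one_over_G * subsets_with_s


y = [Fraction(3), Fraction(5), Fraction(8), Fraction(13)]  # hypothetical values
brute_force = [expected_subject_term(y, n=2, s=s) for s in range(4)]
```

Both routes give $\frac{1}{2}y_s$ when $N = 4$ and $n = 2$, so collapsing permutations to subsets does not change the expected value.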

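The random vector $\mathbf{I}^* = (I^*_1\ I^*_2\ I^*_3)'$ of indicator random variables for the unique sets is multinomial, where $E(\mathbf{I}^*) = \frac{1}{G}\mathbf{1}_G$ and $\mathrm{var}(\mathbf{I}^*) = \frac{1}{G}\left(\mathbf{I}_G - \frac{1}{G}\mathbf{J}_G\right)$, where $G = 3$. Let us index the elements in the matrix of potentially observable random variables by $gi$; the indices are interpreted as the subset and the position. Writing $\mathbf{Z}^*$ for this matrix, notice that, ordering by position,

$$\mathrm{vec}(\mathbf{Z}^*) = \begin{pmatrix} \mathrm{diag}(\mathbf{I}^*)\,\mathbf{C}_1 \\ \mathrm{diag}(\mathbf{I}^*)\,\mathbf{C}_2 \end{pmatrix}\mathbf{y},$$

where $\mathbf{C}_i$ identifies the subject in position $i$ of each subset. It is also valuable to express the same random variables ordered by subset, where

$$\mathbf{D}_1\mathbf{y} = \begin{pmatrix} 1&0&0 \\ 0&1&0 \end{pmatrix}\mathbf{y}, \quad \mathbf{D}_2\mathbf{y} = \begin{pmatrix} 1&0&0 \\ 0&0&1 \end{pmatrix}\mathbf{y}, \quad \text{and} \quad \mathbf{D}_3\mathbf{y} = \begin{pmatrix} 0&1&0 \\ 0&0&1 \end{pmatrix}\mathbf{y},$$

so that

$$\mathrm{vec}(\mathbf{Z}^*) = \begin{pmatrix} I^*_1\mathbf{D}_1 \\ I^*_2\mathbf{D}_2 \\ I^*_3\mathbf{D}_3 \end{pmatrix}\mathbf{y}.$$

We wish to represent random variables in the sample and the remainder. First, suppose that the sample corresponds to the subset represented by $g = 1$. The sample and remainder are given by $\mathbf{K}_1\mathrm{vec}(\mathbf{Z}^*)$, where $\mathbf{K}_1$ is a matrix of zeros and ones that places the random variables for subset 1 first, followed by the random variables for the other subsets. When $g = 2$, the sample and remainder are given by $\mathbf{K}_2\mathrm{vec}(\mathbf{Z}^*)$, which places the random variables for subset 2 first. When $g = 3$, the sample and remainder are given by $\mathbf{K}_3\mathrm{vec}(\mathbf{Z}^*)$, which places the random variables for subset 3 first. Each of these representations partitions random variables in the usual manner into those that will be realized via selection of a sample, and those that will not be realized.

We use the expression for $\mathrm{vec}(\mathbf{Z}^*)$ in terms of $\mathbf{D}_g\mathbf{y}$ to re-express the partitioned random variables. Before doing so, let us express $\mathbf{D}_g\mathbf{y}$ as

$$\mathbf{D}_g\mathbf{y} = \mathrm{vec}\left((\mathbf{D}_g\mathbf{y})'\right) = \mathrm{vec}\left(\mathbf{y}'\mathbf{D}_g'\right).$$

For example,

$$\mathbf{y}'\mathbf{D}_1' = (y_1\ y_2\ y_3)\begin{pmatrix} 1&0 \\ 0&1 \\ 0&0 \end{pmatrix} = (y_1\ y_2).$$

As a result,

$$\mathrm{vec}(\mathbf{Z}^*) = \begin{pmatrix} I^*_1\,\mathrm{vec}(\mathbf{y}'\mathbf{D}_1') \\ I^*_2\,\mathrm{vec}(\mathbf{y}'\mathbf{D}_2') \\ I^*_3\,\mathrm{vec}(\mathbf{y}'\mathbf{D}_3') \end{pmatrix},$$

or, stacking over $g$, $\mathrm{vec}(\mathbf{Z}^*)$ has blocks $I^*_g\,\mathrm{vec}(\mathbf{y}'\mathbf{D}_g')$.

Our goal is to separate the random variables in $\mathrm{vec}(\mathbf{Z}^*)$ into sets corresponding to the sample and the remainder. The sample will be the subset corresponding to the permutation selected, and can be identified only by knowing the realization of $\mathbf{I}^*$. What are the remaining random variables after realizing $\mathbf{I}^*$? Perhaps a way of thinking of the remaining random variables is in terms of the blocks of $\mathrm{vec}(\mathbf{Z}^*)$ for the subsets that were not selected. When we realize $\mathbf{I}^*$, some of the values of the remaining random variables will be known, and others will be missing.

Now, when $g = 1$, partitioning $\mathbf{K}_1 = \left(\mathbf{K}_{1,I}'\ \mathbf{K}_{1,II}'\right)'$, the sample is $\mathbf{K}_{1,I}\mathrm{vec}(\mathbf{Z}^*) = I^*_1\,\mathrm{vec}(\mathbf{y}'\mathbf{D}_1')$ and the remainder is $\mathbf{K}_{1,II}\mathrm{vec}(\mathbf{Z}^*)$. Similarly, partitioning $\mathbf{K}_2 = \left(\mathbf{K}_{2,I}'\ \mathbf{K}_{2,II}'\right)'$, the sample is $\mathbf{K}_{2,I}\mathrm{vec}(\mathbf{Z}^*) = I^*_2\,\mathrm{vec}(\mathbf{y}'\mathbf{D}_2')$, while partitioning $\mathbf{K}_3 = \left(\mathbf{K}_{3,I}'\ \mathbf{K}_{3,II}'\right)'$, the sample is $\mathbf{K}_{3,I}\mathrm{vec}(\mathbf{Z}^*) = I^*_3\,\mathrm{vec}(\mathbf{y}'\mathbf{D}_3')$. In general, $\mathbf{K}_{g,I}\mathrm{vec}(\mathbf{Z}^*) = I^*_g\,\mathrm{vec}(\mathbf{y}'\mathbf{D}_g')$.

In order to evaluate the BLUP of $P = \mathbf{L}'\mathrm{vec}(\mathbf{Z}^*)$, we re-arrange the random variables into the sample and the remainder. Since $\mathrm{vec}(\mathbf{Z}^*)$ corresponds to a list of all possible subsets of the population, and only one subset will correspond to the sample, the re-arrangement will depend upon which subset is the sample subset. This corresponds to the realization of $\mathbf{I}^*$. For this reason, the re-arrangement will depend on which subset is chosen, and hence the re-arrangement matrix will depend on $g$, the selected subset. We express $P = \mathbf{L}'\mathbf{K}_g'\mathbf{K}_g\mathrm{vec}(\mathbf{Z}^*) = \mathbf{L}_g'\mathbf{K}_g\mathrm{vec}(\mathbf{Z}^*)$, where $\mathbf{L}_g = \mathbf{K}_g\mathbf{L} = \left(\mathbf{L}_{g,I}'\ \mathbf{L}_{g,II}'\right)'$. Notice that $\mathbf{K}_{g,I}\mathrm{vec}(\mathbf{Z}^*)$ is of dimension $n \times 1$, while $\mathbf{K}_{g,II}\mathrm{vec}(\mathbf{Z}^*)$ is of dimension $n(G-1) \times 1$. Only the elements of $\mathbf{I}^*$ are stochastic.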

We consider an example to illustrate these ideas. First, suppose that $P$ corresponds to the sum of all potentially observable random variables, such that $\mathbf{L} = \mathbf{1}$. In this setting, for all $g = 1, \ldots, G$, the re-arranged coefficients are $\mathbf{L}_g = \left(\mathbf{L}_{g,I}'\ \mathbf{L}_{g,II}'\right)'$ with $\mathbf{L}_{g,I} = \mathbf{1}_n$ and $\mathbf{L}_{g,II} = \mathbf{1}_{n(G-1)}$. When $N = 3$ and $n = 2$, $\mathbf{L}_{g,I} = \mathbf{1}_2$ and $\mathbf{L}_{g,II} = \mathbf{1}_4$.

Summary of Finite Population Missing Data Objectives

The main objective of this document is to relate these ideas of missing data in the context of a finite population mixed model to missing data concepts and terms used in the literature with a potentially observable response (or counter-factual) missing data framework. Such a framework has been developed and popularized by Little and Rubin (2002).

Before discussing the missing data framework of Little and Rubin in this context, we mention the relationship between using a random permutation to represent possible samples, and Godambe's (1955) discussion of issues faced in sampling. Samples can be described as partial realizations of permutations of population subjects. It is not necessary that each permutation be equally likely, although simple random sampling has this property. If we define an indicator random variable for each permutation, there will be $N!$ indicator random variables, subject to the constraint that for any sample, only one of the indicator random variables has a value of one (i.e., their sum is one). The constraint implies that representing sampling as possible permutations of the population can be accomplished by defining $N! - 1$ indicator random variables. For example, when $N = 3$, five correlated indicator random variables define a random permutation model. We can think of this as a multinomial model, where the categories correspond to permutations.

There are $N^2$ random variables contained in $\mathbf{U}$, or nine random variables when $N = 3$. However, these random variables span only a four dimensional space, since the random variables are subject to the constraint that for all realizations, each row sums to one, and each column sums to one. As a result, representing the random permutation model in terms of the indicator random variables defined in $\mathbf{U}$ includes only a subset of the random variables in the more general fully specified random permutation model used by Godambe (1955). We have not investigated under what conditions there is information lost in reducing the dimension of the random variables from $N! - 1$ to $(N-1)^2$, but this may be valuable to investigate in the future. The beginning of such an investigation is given in c07ed59.doc.
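The dimension count can be checked directly. The sketch below is an illustration, with hypothetical function names: it vectorizes every realization of the position-by-subject indicator matrix, subtracts the first realization from the others, and computes the rank of the differences with exact rational Gaussian elimination. That rank is the dimension of the space spanned by the realizations once the row-sum and column-sum constraints are taken into account.

```python
from fractions import Fraction
from itertools import permutations


def rank(rows):
    """Rank of a matrix (list of rows) by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(v) for v in r] for r in rows]
    rk = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for r in range(len(rows)):
            if r != rk and rows[r][col] != 0:
                factor = rows[r][col] / rows[rk][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[rk])]
        rk += 1
    return rk


def centered_permutation_vectors(N):
    """vec(u_h) - vec(u_1) for every realization u_h other than the first."""
    vecs = [
        [1 if perm[i] == s else 0 for i in range(N) for s in range(N)]
        for perm in permutations(range(N))
    ]
    base = vecs[0]
    return [[a - b for a, b in zip(v, base)] for v in vecs[1:]]


dimension = rank(centered_permutation_vectors(3))
```

For $N = 3$ the rank is 4, which equals $(N-1)^2$, while the multinomial representation uses $N! - 1 = 5$ free indicator random variables.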

References

Cochran, W (1963) Survey Sampling, John Wiley and Sons, New York.

Ericson (1988)

Godambe, VP (1955) A unified theory of sampling from finite populations, Journal of the Royal Statistical Society B 17: 269-278.

Little, RA, and Rubin, DB (1986) Statistical Analysis with Missing Data, First Edition. John Wiley and Sons, New York.

Little, RA, and Rubin, DB (2002) Statistical Analysis with Missing Data, Second Edition. John Wiley and Sons, New York.

Royall, R (1970) On finite population sampling theory under certain linear regression models, Biometrika 57: 377-387.

Royall, RM (1988) The prediction approach to sampling theory. In: Handbook of Statistics Volume 6 Sampling (Krishnaiah, PR, and Rao, CR, Eds), 399-413. North-Holland, Amsterdam.

Rubin, DB (1975) Bayesian inference for causality: The importance of randomization. Proc Social Statistics Section, Am Statistic Assoc, pp 233-239.

Rubin, DB (1976) Inference and missing data, Biometrika 63: 581-592.

Särndal, C-E, Swensson, B, and Wretman, J (1992) Model Assisted Survey Sampling, Springer-Verlag.

Stanek, EJ, and Singer, JM (2004) Predicting random effects from finite population clustered samples with response error, Journal of the American Statistical Association 99: 1119-1130.

Stanek, EJ, Singer, JM, and Lencina, VB (2004) A unified approach to estimation and prediction under simple random sampling, Journal of Statistical Planning and Inference 121: 325-338.