Sampling, WLS, and Mixed Models Festschrift to Honor Professor Gary Koch


Sampling, WLS, and Mixed Models
Festschrift to Honor Professor Gary Koch

Edward J. Stanek III, Department of Public Health, University of Massachusetts, Amherst, MA, 40 Arnold House, 75 N. Pleasant Street, University of Massachusetts, Amherst MA

and

Julio M. Singer, Departamento de Estatística, Universidade de São Paulo, Brazil, uliosiger@gmail.com

Running Title: Sampling, WLS, and Mixed Models

KEYWORDS: best linear unbiased predictors, latent values, prediction, shrinkage, superpopulation, design-based inference

C09ed3v6v.doc /8/009 5:5 PM

Abstract

Mixed models may be defined with or without reference to sampling or sampling random variables, and can be used to predict realized random effects, as for example when estimating the latent values of study subjects measured with response error. When the model is specified without reference to sampling, a simple mixed model includes two random variables, with one stemming from an exchangeable distribution of latent values of study subjects and the other from the study subjects' response error distributions. Positive probabilities are assigned to both potentially realizable responses and artificial responses that are not potentially realizable, resulting in artificial latent values. In contrast, finite population mixed models may be defined to represent the two-stage process of sampling subjects and measuring their responses, where positive probabilities are only assigned to potentially realizable responses. A comparison of the estimators over the same potentially realizable responses indicates that the optimal linear mixed model estimator (the usual best linear unbiased predictor, BLUP) is often (but not always) more accurate than the comparable finite population mixed model estimator (the FPMM BLUP). The example provides the basis for a broader discussion of the role of conditioning, sampling, and model assumptions in developing inference.

Introduction

Advances in public health and health science are tied to understanding practical implications of changes in policy, programs, underlying causes of diseases, prevention, and/or treatment (Koch et al. (1980)). Understanding the impact of such changes is the focus of much of Biostatistics. Not only does Biostatistics embrace the theoretical underpinnings of statistical modeling, but it seeks to tie the results of studies to reality. It is this struggle that has been the focus of much of Koch's work, and continues to be of compelling interest. For this reason, sampling plays an important role in many applications, since estimates are needed for real populations. This is not a simple process, because it involves reconciling seemingly ad hoc approaches, such as in Koch's (1967) procedure to estimate the population mean, with the fundamental basis of inference from survey sampling, when extended, for example, to response error (Koch, 1973), and to model based approaches.

We consider a simple setting that we feel challenges the depths of understanding statistical inference, namely, estimating the latent value of some characteristic (e.g., cholesterol) for a subject. A practical example is the Seasons Study (Merriam et al. (1999), Ockene et al. (2004)), where three 24-hour recall dietary interviews were collected on each study subject in each season of a year to evaluate seasonal cholesterol changes, controlling for the contribution of saturated fat intake. The 24-hour recalls were used to estimate the average saturated fat intake for each subject in the six weeks prior to cholesterol measure (the latent value). Average saturated fat intake and the estimated standard deviation for 554 study subjects are displayed in Figure 1 and reveal that both the latent value and the variance in saturated fat intake are likely to vary among subjects.
Rather than using the simple average saturated fat intake for the season to estimate a subject's latent saturated fat intake (which is the best linear unbiased estimator from a response error model), a more accurate estimator is the BLUP from a mixed model (MM), obtained by replacing the subject effect by a random effect. Although the MM-BLUP is commonly used to estimate a subject's latent value, a close examination reveals that a portion of the MM sample space is artificial and not potentially realizable. This motivates a re-examination of the associated sample space and the criteria used to evaluate its performance. It also provides a context for comparison with the FPMM and FPMM-BLUP considered by Stanek and Singer (2004), where all sample points are potentially realizable.

We discuss these issues in the context of a simple problem. First, we develop a MM for a set of subjects whose responses follow a simple response error model by replacing the subject effect by a random effect. We use this setup to develop the MM-BLUP. Via a simple example, we describe the sample space for the MM, and distinguish the (artificial) MM-latent values from (potentially realizable) subject latent values, and MM-responses from potentially realizable responses. We contrast this development with a FPMM mixed model that is based on a simple random sample of a population, including a response error model for sample subjects' responses, and the corresponding FPMM-BLUP. We conclude with a discussion of the connections between the issues raised in the example and some broader ideas.

Estimating Latent Values in The Mixed Model

Many frameworks can be used to develop a best linear unbiased predictor, as discussed by Robinson (1991). Although with appropriate assumptions the expression for the BLUP may be identical when motivated from different frameworks, the differences between them are important in understanding the connection between the physical problem, the stochastic model, and the solution. These connections are important since they facilitate the interpretation of the results.

The MM framework we use begins with an additive response error model for each subject, and assumes exchangeability of the corresponding latent values. The study subjects constitute a set that may or may not have been obtained as a result of a probability sample from a population. In the Seasons Study, for example, the study subjects constitute a subset of members of the Fallon Health Maintenance Organization (HMO) who volunteered to participate, and not a probability sample of Fallon HMO members.

We start with a set of $n$ subjects, labeled $j = 1,\dots,n$, and assume that repeated responses, $Y_{jk}$, $k = 1,\dots,r_j$, are associated with subject $j$. The data for the set correspond to the pairs $\left(j, (Y_{j1}\; Y_{j2}\; \cdots\; Y_{jr_j})\right)$, $j = 1,\dots,n$. We assume that the responses associated with subject $j$ are independent and correspond to identically distributed random variables $Y_{jk}$, $k = 1,\dots,r_j$, and define $E_R(Y_{jk}) = y_j$ as the latent value for subject $j$; we also denote the corresponding response error variance by $\operatorname{var}_R(Y_{jk}) = \sigma_j^2$. The subscript $R$ indicates expectation with respect to the distribution of the response error. For simplicity, we consider a single measure for subject $j$ and drop the subscript $k$ so that the response error model may be written as

$$Y_j = y_j + E_j. \qquad (1)$$

When $r_j > 1$, $Y_j$ and $E_j$ correspond to the average response and average response error, respectively, and $\sigma_j^2$ represents the variance of these averages, which we assume known. The latent values, $y_j$, are the parameters of interest. Without additional assumptions, the response for subject $j$, namely $Y_j$, is the best estimator of the subject's latent value.

We construct a MM by adding to the response error model the assumption that the latent values for the subjects are a realization of $P = (P_1\; P_2\; \cdots\; P_n)'$, a vector of exchangeable random variables whose possible equally likely values include (but are not restricted to) the latent values of the subjects in the set. Writing $N^*$ for the number of latent values underlying the random variables in $P$, these values could be solely the $N^* = n$ latent values of the subjects in the set, the $N^* = N$ latent values of subjects in a population from which the subjects were sampled, or some other set of values, inclusive of the latent values of the subjects in the set, which we may refer to collectively as a superpopulation.

Let us define $E_\xi(P_j) = \mu$ and $E_\xi(P_j - \mu)^2 = \frac{N^*-1}{N^*}\gamma^2$, where the subscript $\xi$ indicates expectation with respect to the distribution of latent values. The term $\mu$ is the mean latent value of the subjects in the set when $n = N^*$, the mean latent value of subjects in the population when $N = N^*$, or the mean latent value of subjects in the superpopulation. When $n = N^*$, $\gamma^2 = \frac{1}{n-1}\sum_{j=1}^{n}(y_j - \mu)^2$, while when $N = N^*$, $\gamma^2 = \frac{1}{N-1}\sum_{s=1}^{N}(y_s - \mu)^2$, where $s = 1,\dots,N$ label subjects in a finite population.

A realization of $P_j$ is a MM-latent value, which we re-parameterize as $P_j = \mu + a_j$, $j = 1,\dots,n$, where $E_\xi(a_j) = 0$, and $E_\xi(a_j a_{j^*}) = \frac{N^*-1}{N^*}\gamma^2$ when $j = j^*$, or $E_\xi(a_j a_{j^*}) = -\frac{\gamma^2}{N^*}$ otherwise. One possible realization of $P$ is $y = (y_1\; y_2\; \cdots\; y_n)'$. The MM is given by

$$Y_j = \mu + a_j + E_j \qquad (2)$$

or in matrix form, by $Y = X\mu + Za + E$, where $Y = (Y_1\; Y_2\; \cdots\; Y_n)'$, $X = 1_n$, a column vector with all elements equal to 1, $Z = I_n$, an identity matrix, $a = (a_1\; a_2\; \cdots\; a_n)'$, the vector of random effects, and $E = (E_1\; E_2\; \cdots\; E_n)'$, a vector of response errors. In the MM, $E_{\xi R}(Y) = X\mu$, while $\operatorname{var}_{\xi R}(Y) = \Omega$, where $\Omega = \Gamma + \bigoplus_{j=1}^{n}\sigma_j^2$, $\Gamma = \gamma^2\left(I_n - \frac{1}{N^*}J_n\right)$, with $J_n = 1_n 1_n'$, and $\bigoplus_{j=1}^{n}\sigma_j^2$ denotes an $n \times n$ matrix with diagonal elements $\sigma_j^2$ and off-diagonal elements equal to zero. Every realization of $Y$ in (1) corresponds to an identical realization of $Y$ in (2), but not vice-versa.

Let the target correspond to $T = g'P$. When $g = e_j$, a vector whose elements are all equal to zero except the element in row $j$ that is equal to one, the target is $P_j$. The corresponding BLUP is a linear function of $Y$ that is unbiased and has minimum expected mean squared error (MSE). We may show (see Appendix A for details) that the BLUP of $P_j$ is

$$\hat{P}_j = \hat{\mu} + k_j\,(Y_j - \hat{\mu}) \qquad (3)$$

where $\hat{\mu}$ is the weighted least squares (WLS) estimate of the mean, i.e., $\hat{\mu} = \sum_{j=1}^{n} w_j Y_j$, with

$$w_j = \frac{1/(\gamma^2 + \sigma_j^2)}{\sum_{j^*=1}^{n} 1/(\gamma^2 + \sigma_{j^*}^2)}, \qquad k_j = \frac{\gamma^2}{\gamma^2 + \sigma_j^2}.$$

The expected MSE of the predictor is given by

$$\operatorname{var}_{\xi R}(\hat{P}_j - P_j) = \sigma_j^2 k_j + (1 - k_j)^2\,\frac{\gamma^2}{n\bar{k}}, \qquad \text{where } \bar{k} = \frac{1}{n}\sum_{j=1}^{n} k_j.$$

It is smaller than $\sigma_j^2$, the expected MSE attained when we use the subject's response as an estimate of the subject's latent value (see Appendix B). With this understanding, $\hat{P}_j$ may be considered a better estimator of $y_j$ than the observed response, $Y_j$. Assuming that all realizations of $P$ are distinct, notice that the latent value of subject $j$, given by $y_j$, is the only realization of $T$ that in reality can occur.

The mixed model given by (2) is defined for a set. It is not necessary for the set to have been selected via probability sampling from a larger population, and in fact, the set could include all subjects in a population. An alternative mixed model considers the data to be the realized response of a random sample of subjects from a (finite) population (Stanek and Singer 2004). We refer to this model as the finite population mixed model (FPMM), and note that the BLUP given by (3) is not the same as the FPMM-BLUP. Before considering these differences, we discuss a simple enumerative example for the MM.

Examples

The data correspond to $n = 2$ subjects: Daisy, labeled $j = 1$, and Rose, labeled $j = 2$, summarized in Table 1.

- Insert Table 1 here -

Note that the response error variance differs between subjects. We assume that for each subject, $j = 1, 2$, response error can take on two equally likely values corresponding to $-\sigma_j$ or $\sigma_j$. Under the response error model (1), each response (corresponding to a pair of values for the sample set) is equally likely with probability 1/4. With these assumptions, we display in Table 2 the potentially realizable responses corresponding to the four combinations of response error.

- Insert Table 2 here -

The potentially realizable responses in Table 2 coincide with the responses (which we index by $t$) for the MM when the realization of $P$ (the MM-latent values) is $y$. Since the latent values are assumed to be exchangeable and $n = 2$, there are two possible realizations of $P$. Responses for the other realization of the MM-latent values are listed in Table 3.
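The predictor (3) and the enumeration of the MM sample space for the two-subject example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tables are not reproduced in the text, so the latent values (10 and 2) and response error variances (1 and 4) used for Daisy and Rose are assumed values chosen to agree with the quantities quoted in the discussion of the example, and the function names `mm_blup` and `mm_sample_space` are hypothetical.

```python
from itertools import permutations, product

# Assumed illustrative values; the tables are not reproduced in the text.
latent = [10.0, 2.0]   # latent values y_j for Daisy (j = 1) and Rose (j = 2)
sigma2 = [1.0, 4.0]    # response error variances sigma_j^2
n = len(latent)
mu = sum(latent) / n
# gamma^2 with divisor n - 1 (P ranges over the latent values of the set)
gamma2 = sum((v - mu) ** 2 for v in latent) / (n - 1)


def mm_blup(y_obs, gamma2, sigma2):
    """Return (WLS mean, list of predictors) for one response vector."""
    inv = [1.0 / (gamma2 + s2) for s2 in sigma2]
    total = sum(inv)
    w = [v / total for v in inv]                   # WLS weights w_j
    k = [gamma2 / (gamma2 + s2) for s2 in sigma2]  # shrinkage constants k_j
    mu_hat = sum(wj * yj for wj, yj in zip(w, y_obs))
    preds = [mu_hat + kj * (yj - mu_hat) for kj, yj in zip(k, y_obs)]
    return mu_hat, preds


def mm_sample_space(latent, sigma2):
    """All equally likely (P, Y) points: each permutation of the latent
    values combined with each +/- sigma_j pattern of response errors."""
    points = []
    for p in permutations(latent):
        for signs in product((-1, 1), repeat=len(latent)):
            y = [pj + s * s2 ** 0.5 for pj, s, s2 in zip(p, signs, sigma2)]
            points.append((list(p), y))
    return points


space = mm_sample_space(latent, sigma2)
# Potentially realizable points are those where P equals the actual latent values.
realizable = [(p, y) for p, y in space if p == latent]
mu_hat, preds = mm_blup([11.0, 4.0], gamma2, sigma2)
```

Under these assumptions the sample space has eight points, four of which are potentially realizable; the remaining four assign Daisy the latent value that actually belongs to Rose, and vice versa.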

- Insert Table 3 here -

The responses for the MM listed in Tables 2 and 3 correspond to the equally likely realizations, $Y_t$, $t = 1,\dots,8$, of $Y$, each occurring with probability 1/8. The corresponding realizations, $P_t$, $t = 1,\dots,8$, of $P$ are the realized MM-latent values. When $t = 1,\dots,4$ (as in Table 2), the realizations of $P$ and $Y$ correspond to realizations of $y$ and $Y$ in (1), respectively. For such data, Daisy's realized MM-latent value is 10. The $P_t$ and $Y_t$ are artificial when $t = 5,\dots,8$ (as in Table 3). For these values of $t$, Daisy's realized MM-latent value is 2.

The BLUP of $P_j$ under the MM given by (3) for each realized MM-response is given in Table 4, where the first panel corresponds to Daisy, and the second panel, to Rose. The differences, $\hat{P}_j - P_j$, and the corresponding squared differences are given in the last two columns of each panel. Notice that the average difference is zero, satisfying the unbiased constraint given by $E_{\xi R}(\hat{P}_j - P_j) = 0$. The average squared difference, or MSE, is 0.99 for Daisy and 3.77 for Rose. These values are smaller than those that would result from a best linear unbiased estimator (BLUE) using model (1), namely $\sigma_1^2 = 1$ for Daisy, and $\sigma_3^2 = 4$ for Rose.

- Insert Table 4 here -

There are some problems with these results which can be illustrated by focusing on the MM-responses for Daisy (first panel of Table 4). Notice in Table 1, Daisy's latent value is 10; in Table 4, 2 is also listed as a latent value for Daisy. The MM-latent value of 2 for Daisy (corresponding to the MM-responses $t = 5,\dots,8$) exists only in the mixed model, not in reality. Such a latent value is artificial, and one could argue that it should not be given a positive probability in the analysis. This is not due to ambiguity over which subject is labeled $j = 1$, since this label only corresponds to Daisy in the model definition.

These results shed light on the interpretation of bias and on the definition of the MSE for the MM when the target is the subject's latent value. When $y_j$ is the target, the bias is determined by subtracting the subject's actual latent value from $\hat{P}_j$ in all settings as shown in Table 5. Using the subject's actual latent value, the BLUP given by (3) is biased for each subject, and its MSE is larger than the MSE of the BLUE based on model (1).

- Insert Table 5 here -

In the MM, positive probability is given to MM-responses that are not potentially realizable. By averaging over these artificial responses in addition to the potentially realizable responses, the connection between the MM and reality is broken. This creates contradictions in the interpretation of results. For example, the latent value for Daisy ($j = 1$) is 10 for all potentially realizable responses, but the expected value of the corresponding MM-latent values is $E_{\xi R}(P_1) = 6$. To retain the interpretation that the realized MM-latent value is the latent value for the subject, only the MM sample points that are potentially observable (i.e., corresponding to $t = 1,\dots,4$) should be given positive probability.

Restricting evaluation of (3) to potentially observable responses provides some insight on the bias and the MSE. The conditional bias is given by

$$E_R(\hat{P}_j - P_j \mid P = y) = -(1 - k_j)(y_j - \mu_w), \qquad \mu_w = \sum_{j=1}^{n} w_j y_j$$

(see Appendix C). In the example, the conditional bias for Daisy is -0.12, while the conditional bias for Rose is [...]. The average conditional bias over the subjects is not equal to zero, although the limit of the bias with increasing numbers of measures of response is zero, since $\lim_{r_j \to \infty}(1 - k_j) = 0$. Using a similar definition for the MSE (see Appendix C), i.e.,

$$E_R\left((\hat{P}_j - P_j)^2 \mid P = y\right) = (1 - k_j)^2\left[\sum_{j^*=1}^{n} w_{j^*}^2\sigma_{j^*}^2 + (y_j - \mu_w)^2\right] + k_j\sigma_j^2\left[k_j + 2(1 - k_j)w_j\right], \qquad (4)$$

where $\mu_w = \sum_{j=1}^{n} w_j y_j$, it follows that the MSE for Daisy is [...] and for Rose [...], both smaller than the MSE of the simple responses, $Y_j$, $j = 1, 2$.

Estimating the Mean Latent Value

Our development has focused on estimating the MM-latent value for a subject. We can use similar methods to obtain an estimate of $T = g'P$ where $g = \frac{1}{n}1_n$, a target that corresponds to the average MM-latent value, $\bar{P}$. The corresponding BLUE is the weighted least squares (WLS) estimator given by $\hat{\mu}$. Writing $N^*$ for the number of latent values underlying $P$, when $N^* > N$, $E_\xi(T)$ is the mean of the MM-latent values in the superpopulation, while when $N^* = N$, $E_\xi(T)$ is the mean latent value in the population. When $n = N^*$, $\bar{P}$ is equal to $\bar{y} = \frac{1}{n}\sum_{j=1}^{n} y_j$, the mean of the latent values in the response error model (1). The BLUE of $\bar{y}$ in (1) is the mean response, $\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j$. Since $\bar{P} = \bar{y}$, it is tempting to compare the BLUE obtained under model (1) with the BLUP obtained under model (2) when $n = N^*$, as illustrated in Table 6.

- Insert Table 6 here -
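The distinction between averaging over all MM-responses and averaging only over the potentially realizable ones can be checked by direct enumeration. The sketch below is a minimal illustration, assuming latent values of 10 and 2 and response error variances of 1 and 4 for Daisy and Rose (assumed values, since the tables are not reproduced in the text); the helper names are hypothetical. It also compares the enumerated conditional MSE with a closed-form expression obtained by expanding the predictor $\hat{\mu} + k_j(Y_j - \hat{\mu})$ around the subject's actual latent value.

```python
from itertools import permutations, product

# Assumed illustrative values; the tables are not reproduced in the text.
latent = [10.0, 2.0]   # Daisy, Rose
sigma2 = [1.0, 4.0]
n = len(latent)
mu = sum(latent) / n
gamma2 = sum((v - mu) ** 2 for v in latent) / (n - 1)
inv = [1.0 / (gamma2 + s2) for s2 in sigma2]
w = [v / sum(inv) for v in inv]                 # WLS weights
k = [gamma2 / (gamma2 + s2) for s2 in sigma2]   # shrinkage constants
mu_w = sum(wj * yj for wj, yj in zip(w, latent))


def blup(y_obs):
    """MM predictor for every subject given one response vector."""
    mu_hat = sum(wj * yj for wj, yj in zip(w, y_obs))
    return [mu_hat + kj * (yj - mu_hat) for kj, yj in zip(k, y_obs)]


# Enumerate the MM sample space: permutations of latent values x error signs.
points = []
for p in permutations(latent):
    for signs in product((-1, 1), repeat=n):
        y = [pj + s * s2 ** 0.5 for pj, s, s2 in zip(p, signs, sigma2)]
        points.append((list(p), y))
realizable = [pt for pt in points if pt[0] == latent]


def bias_and_mse(pts, j):
    """Average difference and squared difference between predictor and MM-latent value."""
    d = [blup(y)[j] - p[j] for p, y in pts]
    return sum(d) / len(d), sum(x * x for x in d) / len(d)


def conditional_mse_formula(j):
    """Closed-form conditional MSE given P = y, from expanding the predictor."""
    noise = sum(wi ** 2 * si for wi, si in zip(w, sigma2))
    return ((1 - k[j]) ** 2 * (noise + (latent[j] - mu_w) ** 2)
            + k[j] * sigma2[j] * (k[j] + 2 * (1 - k[j]) * w[j]))


overall = [bias_and_mse(points, j) for j in range(n)]
conditional = [bias_and_mse(realizable, j) for j in range(n)]
```

With these assumed inputs, the overall bias is zero for both subjects while the conditional bias is not, which is the point made in the text about unbiasedness depending on the inclusion of artificial responses.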

Under model (1), there are no responses comparable to the MM-responses for $t = 5,\dots,8$. This is a consequence of the inclusion of artificial responses in the MM. The target, $\bar{P}$, is constant over all possible MM-responses. If we define an estimator similar to $\bar{Y}$ for the MM-responses as $\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j$, then the MSE of $\bar{Y}$ and $\hat{\mu}$ can be evaluated over the same sample space. The MSE of $\hat{\mu}$, given by

$$E_{\xi R}(\hat{\mu} - \bar{P})^2 = \gamma^2\,\frac{1 - \bar{k}}{n\bar{k}}, \qquad \bar{k} = \frac{1}{n}\sum_{j=1}^{n} k_j,$$

is smaller than the MSE of $\bar{Y}$, given by $E_{\xi R}(\bar{Y} - \bar{P})^2$, which equals the MSE of $\bar{Y}$ under model (1), i.e., $E_R(\bar{Y} - \bar{y})^2 = \frac{1}{n^2}\sum_{j=1}^{n}\sigma_j^2$, providing the usual justification for the use of $\hat{\mu}$ instead of $\bar{Y}$.

Evaluated over potentially realizable responses, i.e., those corresponding to $t = 1,\dots,4$, the bias of the WLS estimator is

$$E_{\xi R}(\hat{\mu} - T \mid P = y) = \sum_{j=1}^{n}\frac{k_j - \bar{k}}{n\bar{k}}\,y_j.$$

The unbiased property of the WLS estimate of the average latent value holds only when expectation is taken over all MM-responses, including those artificial responses that are not potentially realizable. The MSE, evaluated only over potentially realizable responses, is

$$MSE_{\xi R}(\hat{\mu} - T \mid P = y) = \sum_{j=1}^{n} w_j^2\sigma_j^2 + \left(\sum_{j=1}^{n} w_j y_j - \bar{y}\right)^2.$$

When $n = 2$, as in the example illustrated in Table 6, this expression simplifies to $E_{\xi R}(\hat{\mu} - \bar{P})^2 = \gamma^2\,\frac{1 - \bar{k}}{n\bar{k}}$. When $n > 2$, as illustrated next, the conditional MSE of the MM-BLUP is not equal to its unconditional MSE, and may be larger (or smaller) than the unconditional MSE.
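The behaviour of the WLS estimator of the mean over realizable and artificial responses can also be examined by enumeration. The following sketch assumes that the exchangeable distribution ranges over the latent values of the set alone, and uses assumed inputs (latent values 10, 2, 3 and response error variances 1, 4, 100 for Daisy, Rose, and Lily), chosen because they agree with the summary figures quoted in the text; the tables themselves are not reproduced, and the function name `evaluate_mean` is hypothetical.

```python
from itertools import permutations, product


def evaluate_mean(latent, sigma2):
    """Enumerate the MM sample space and summarize the WLS mean and the
    simple mean as estimators of the average MM-latent value."""
    n = len(latent)
    mu = sum(latent) / n
    gamma2 = sum((v - mu) ** 2 for v in latent) / (n - 1)
    inv = [1.0 / (gamma2 + s2) for s2 in sigma2]
    w = [v / sum(inv) for v in inv]
    wls_real, wls_all, ybar_all = [], [], []
    for p in permutations(latent):
        target = sum(p) / n                  # constant: P ranges over the set only
        is_real = list(p) == list(latent)
        for signs in product((-1, 1), repeat=n):
            y = [pj + s * s2 ** 0.5 for pj, s, s2 in zip(p, signs, sigma2)]
            wls = sum(wj * yj for wj, yj in zip(w, y))
            wls_all.append((wls, (wls - target) ** 2))
            ybar_all.append((sum(y) / n - target) ** 2)
            if is_real:
                wls_real.append((wls, (wls - target) ** 2))

    def avg(xs):
        return sum(xs) / len(xs)

    return {
        "mean_wls_realizable": avg([v for v, _ in wls_real]),
        "mean_wls_all": avg([v for v, _ in wls_all]),
        "mse_wls_realizable": avg([e for _, e in wls_real]),
        "mse_wls_all": avg([e for _, e in wls_all]),
        "mse_ybar_all": avg(ybar_all),
    }


# Assumed illustrative inputs (not taken from the tables, which are not reproduced).
two = evaluate_mean([10.0, 2.0], [1.0, 4.0])
three = evaluate_mean([10.0, 2.0, 3.0], [1.0, 4.0, 100.0])
```

Under these assumptions the conditional and unconditional MSE of the WLS estimator coincide for two subjects but differ for three, and the WLS estimator averages to the target only when the artificial responses are included.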

A Slightly Larger Example

Although in the first example with $n = 2$ it was possible to enumerate all outcomes, some issues that occur more generally could not be revealed. We briefly discuss a second example where $n = N^*$ (the latent values underlying $P$ are those of the subjects in the set) and the data correspond to $n = 3$ subjects, Daisy, Rose, and Lily, which we label $j = 1,\dots,j = 3$, respectively. We assume that response error for subject $j$ can take on two equally likely values corresponding to $-\sigma_j$ or $\sigma_j$. With these assumptions, there are eight equally likely possible potentially realizable responses corresponding to the different combinations of response error (Table 7).

- Insert Table 7 here -

The $t = 1,\dots,8$ potentially realized responses in Table 7 are possible responses for the MM when the realization of $P$ (the MM-latent values) is $y$. Since the latent values are assumed to be exchangeable and $n = 3$, there are six possible realizations of $P$. Replacing $y$ by each of these realizations gives rise to 40 artificial responses that are not realizable, but are included with positive probability in the MM. The predictor of $P_j$ given by (3) under the MM for Daisy is listed for $t = 1,\dots,8$ in Table 8, and for $t = 9,\dots,48$ in Table 9. We summarize the results for the MM-BLUP of each subject in Table 10.

- Insert Tables 8-10 here -

Notice that when averaging over the potentially realizable responses ($t = 1,\dots,8$), the MM-latent value is the subject's latent value. The average squared difference between the MM-BLUP and the MM-latent value for the potentially realizable responses is larger than a similar average over the non-realizable responses for Daisy and Rose, but not for Lily. It is the overall average MSE (over $t = 1,\dots,48$) that is usually evaluated for the MM, even though such an average includes responses that are not potentially realizable.

Since $n = N^*$, $\bar{P}$ is a constant in this example given by $\bar{P} = 5$. It is of value to consider the MM-BLUP estimator of $\bar{P}$. Over the potentially realizable responses ($t = 1,\dots,8$), the average of $\hat{\mu}$ is 6.009, while over the non-observable responses, the average is [...]. Although the simple average of $\hat{\mu}$ over all MM-responses is equal to $\bar{P}$, this unbiased result occurs only if the artificial responses are included. The average MSE for the potentially realizable responses ($t = 1,\dots,8$) is given by 2.667, while [...] is the average MSE for the artificial responses ($t = 9,\dots,48$). The average MSE (over all $t = 1,\dots,48$) given by 3.48 is larger than the average MSE for the potentially realizable responses, but smaller than the comparable average MSE under the response error model given by 11.667.

The Finite Population Mixed Model

We now consider the data to be the realized response of a simple random sample of subjects from a finite population, assuming a single response for each sample subject. We represent subjects, latent values, and response using similar notation as in model (1). We define the population as a set of $N$ labeled subjects, assigning the subscript $s$ as a label to subjects placed in alphabetical order by name, and represent the $N \times 1$ vectors of latent values and response errors by $\mathbf{y}$ and $\mathbf{E}$, respectively. With this definition, $\mu = \frac{1}{N}\sum_{s=1}^{N} y_s$ corresponds to the usual finite population mean, while $\frac{N-1}{N}\gamma^2$, where $\gamma^2 = \frac{1}{N-1}\sum_{s=1}^{N}(y_s - \mu)^2$, corresponds to the usual finite population variance. We define $\mathbf{Y}$ as an $N \times 1$ response vector with elements $Y_s = y_s + E_s$, $s = 1,\dots,N$, so that the response error model for the population is given by

$$\mathbf{Y} = \mathbf{y} + \mathbf{E}. \qquad (5)$$

We define a sample as a sequence of $n$ subjects, and use $i = 1,\dots,n$ to index the subjects in the sequence. We index the possible sequences of subjects by $h$, where $h = 1,\dots,H$ and $H = \frac{N!}{(N-n)!}$. Let $y_{hi}$ denote the latent value for the subject in position $i$ in sequence $h$ and define the sample vector of latent values by $y_h = (y_{h1}\; y_{h2}\; \cdots\; y_{hn})'$. This general representation of a sample is similar to that proposed by Godambe (1955).

We define response for sequence $h$ by $Y_h = u_h'\mathbf{Y}$ so that the element $Y_{hi}$ denotes response for the subject in position $i$ in sequence $h$, $Y_h = (Y_{h1}\; Y_{h2}\; \cdots\; Y_{hn})'$, and $u_h = (u_{h1}\; u_{h2}\; \cdots\; u_{hn})$ is an $N \times n$ matrix of constants with columns given by $u_{hi} = (u_{hi1}\; u_{hi2}\; \cdots\; u_{hiN})'$ for $i = 1,\dots,n$. The element $u_{his}$ has a value of one if subject $s$ is in position $i$ in sequence $h$, and zero otherwise. For example, when $n = 2$ and $N = 3$, the data for sequence $h$ consisting of subject $s = 3$ followed by subject $s = 1$ is $\left((s = 3, Y_{h1})\; (s = 1, Y_{h2})\right)$ so that

$$u_h = \begin{pmatrix} u_{h11} & u_{h21} \\ u_{h12} & u_{h22} \\ u_{h13} & u_{h23} \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ 0 & 0 \\ 1 & 0 \end{pmatrix}.$$

Latent values and response errors for the subjects in sequence $h$ are defined in a similar manner by $y_h = u_h'\mathbf{y}$ and $E_h = u_h'\mathbf{E}$, respectively.

While it is possible to relate the response for the subject in position $i$ in sequence $h$ to the response for subject $j$ defined by the response error model (1) (see Appendix D), it is important to note that the subject in position $i = 1$, for example, in sequence $h$ is not necessarily the same subject as the subject labeled $j = 1$ in model (1), since for only one sequence will the order of the subjects in (1) match the subject's position in the sequence.
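The sequence representation of a sample can be made concrete with a short Python sketch. It builds the ordered sequences of subject labels and the matrix of zero-one indicators for one sequence, then uses that matrix to pick out the values for the subjects in the sequence. The function names and the placeholder population values are hypothetical and serve only as an illustration.

```python
from itertools import permutations


def sequences(N, n):
    """All ordered sequences of n distinct subject labels from 1..N."""
    return list(permutations(range(1, N + 1), n))


def indicator_matrix(seq, N):
    """N x n matrix whose entry in row s, column i equals 1 when
    subject s occupies position i of the sequence, and 0 otherwise."""
    return [[1 if seq[i] == s else 0 for i in range(len(seq))]
            for s in range(1, N + 1)]


def select(u, v):
    """Compute u' v: the values of v for the subjects in the sequence, in order."""
    n = len(u[0])
    return [sum(u[s][i] * v[s] for s in range(len(v))) for i in range(n)]


all_sequences = sequences(3, 2)        # N = 3, n = 2
u_h = indicator_matrix((3, 1), 3)      # subject s = 3 followed by subject s = 1
population = [100.0, 200.0, 300.0]     # placeholder values for s = 1, 2, 3
sample_values = select(u_h, population)
```

For $N = 3$ and $n = 2$ this produces six sequences, and the indicator matrix for the sequence (3, 1) has a one in row 3 of the first column and in row 1 of the second column.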

In the finite population mixed model we assume that a sample corresponds to a randomly selected sequence. To formalize this, we let $I_h$ represent an indicator random variable that has a value of one when sample sequence $h$ is selected, and zero otherwise, and let $Y = \sum_{h=1}^{H} I_h Y_h$. We assume that all sample sequences are equally likely (corresponding to simple random sampling without replacement), so that $E_p(I_h) = \frac{1}{H}$ (where the subscript $p$ indicates expectation with respect to sampling). Letting $U = \sum_{h=1}^{H} I_h u_h$ with elements $U_{is} = \sum_{h=1}^{H} I_h u_{his}$, $i = 1,\dots,n$, $s = 1,\dots,N$, it follows that $Y = U'\mathbf{Y}$, where $\mathbf{Y}$ is the $N \times 1$ population response vector, is a vector of sample random variables, $Y_i = \sum_{s=1}^{N} U_{is} Y_s$, $i = 1,\dots,n$. Using (5) and defining subject effects by $\beta_s = y_s - \mu$, $s = 1,\dots,N$, the finite population mixed model may be written as

$$Y_i = \mu + b_i + E_i \qquad (6)$$

where $b_i = \sum_{s=1}^{N} U_{is}\beta_s$ and $E_i = \sum_{s=1}^{N} U_{is}E_s$, or in matrix form as $Y = X\mu + Zb + E$ where $b = U'\beta$, $b = (b_1\; b_2\; \cdots\; b_n)'$, $\beta = (\beta_1\; \beta_2\; \cdots\; \beta_N)'$ and $E = U'\mathbf{E}$. This represents the sample random variables in the finite population as defined by Stanek and Singer (2004). The random variable $b_i$ corresponds to the deviation of the subject's latent value from the population mean for the subject in position $i$ in a randomly selected sequence.

Let the target correspond to a linear function of $P = X\mu + Zb$ given by $T = g'P$. When $g = e_i$ (a vector whose elements are all equal to zero except the element in row $i$ that is equal to one), the target is $P_i$. The corresponding FPMM-BLUP is a linear function of $Y$ that is unbiased and has minimum expected mean squared error (MSE). We show (see Appendix E for details) that the FPMM-BLUP is

$$\hat{P}_i = \bar{Y} + k\,(Y_i - \bar{Y})$$

where $\bar{Y} = \frac{1}{n}\sum_{i=1}^{n} Y_i$ is the sample average response, $k = \frac{\gamma^2}{\gamma^2 + \bar{\sigma}^2}$, and $\bar{\sigma}^2 = \frac{1}{N}\sum_{s=1}^{N}\sigma_s^2$. The expected MSE of the predictor is

$$\operatorname{var}_{pR}(\hat{P}_i - P_i) = \bar{\sigma}^2\left[\frac{1}{n} + k\left(1 - \frac{1}{n}\right)\right].$$

When $T = \mu$, we have $\hat{P} = \bar{Y}$, and it follows that $\operatorname{var}_{pR}(\hat{P} - \mu) = (1 - f)\frac{\gamma^2}{n} + \frac{\bar{\sigma}^2}{n}$, with $f = n/N$ the sampling fraction.

Of particular interest is a comparison of the average MSE for potentially realizable responses. For all sample sets and subjects, the MM-BLUP MSE is smaller than the MSE of the observed response, and smaller than the MSE of the FPMM-BLUP. For a given sample sequence, the FPMM-BLUP is biased, with the bias given by $E_R(\hat{P}_i - P_i \mid h) = -(1 - k)(y_{hi} - \mu_h)$, where $\mu_h = \frac{1}{n}\sum_{i=1}^{n} y_{hi}$ (see Appendix F). Conditional on a sample sequence $h$, the MSE of the FPMM-BLUP is given by

$$MSE_{pR}(\hat{P}_i - P_i \mid h) = k\left[k + \frac{2(1 - k)}{n}\right]\sigma_{hi}^2 + (1 - k)^2\,\frac{\bar{\sigma}_h^2}{n} + (1 - k)^2\,(y_{hi} - \mu_h)^2, \qquad (7)$$

where $\sigma_{hi}^2$ is the response error variance for the subject in position $i$ in sequence $h$ and $\bar{\sigma}_h^2 = \frac{1}{n}\sum_{i=1}^{n}\sigma_{hi}^2$.

Examples

We consider the FPMM-BLUP for simple random samples of size $n = 2$ from the population of $N = 3$ subjects listed in Table 1. First, note that there are six possible sample sequences, with $2! = 2$ sequences for each sample set. Since the FPMM-BLUP is identical for a subject in different sequences in the same set, we list the $t = 1,\dots,12$ possible equally likely sample responses in Table 11 corresponding to the three sample sets.
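The FPMM-BLUP for samples of two subjects from a population of three can be evaluated by enumerating every sample sequence and every pattern of response error. The sketch below is an illustration under stated assumptions: the population values (latent values 10, 3, 2 and response error variances 1, 100, 4 for Daisy, Lily, and Rose) are assumed, chosen to agree with the summary figures reported in the text (an MSE of 48.58 for Lily, an average MSE of 23.66 for the predictor, and 35 for the observed response), since the tables are not reproduced. The closed-form value `expected` is computed as $\bar{\sigma}^2[1/n + k(1 - 1/n)]$ for comparison with the enumeration.

```python
from itertools import permutations, product

# Assumed population values (alphabetical labels s = 1, 2, 3); not from the tables.
names = ["Daisy", "Lily", "Rose"]
latent = {"Daisy": 10.0, "Lily": 3.0, "Rose": 2.0}
sigma2 = {"Daisy": 1.0, "Lily": 100.0, "Rose": 4.0}
N, n = 3, 2

mu = sum(latent.values()) / N
gamma2 = sum((v - mu) ** 2 for v in latent.values()) / (N - 1)
sigma2_bar = sum(sigma2.values()) / N     # average response error variance
k = gamma2 / (gamma2 + sigma2_bar)        # common shrinkage constant

# Differences between predictor and the subject's latent value, by subject.
errors = {name: [] for name in names}
for seq in permutations(names, n):                 # equally likely sequences
    for signs in product((-1, 1), repeat=n):       # equally likely error patterns
        y = [latent[s] + sg * sigma2[s] ** 0.5 for s, sg in zip(seq, signs)]
        ybar = sum(y) / n
        for s, yi in zip(seq, y):
            errors[s].append(ybar + k * (yi - ybar) - latent[s])

bias = {s: sum(e) / len(e) for s, e in errors.items()}
mse = {s: sum(d * d for d in e) / len(e) for s, e in errors.items()}
avg_bias = sum(bias.values()) / N
avg_mse = sum(mse.values()) / N
expected = sigma2_bar * (1 / n + k * (1 - 1 / n))
```

With these assumed inputs each subject's predictor is biased, the biases cancel when averaged over subjects, and the predictor has larger MSE than the observed response for the two subjects with small response error variance but much smaller MSE for the subject with large response error variance.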

- Insert Table 11 here -

Notice that the FPMM-BLUP is a biased predictor of each subject's latent value, but the average bias (over all subjects) is zero. The MSE differs between subjects, and exceeds the MSE of the observed response for Daisy and Rose, but is smaller (48.58 vs 100) than the MSE of the observed response for Lily. The average MSE of the FPMM-BLUP (i.e., 23.66) over all subjects is smaller than the average MSE of the observed response (i.e., 35).

Table 12 provides a summary of the MM-BLUP (when $n = N^*$, that is, when the latent values underlying $P$ are those of the subjects in the set) and the FPMM-BLUP for the three different sets of $n = 2$ from the population of $N = 3$ listed in Table 1. Recall that the MM-BLUP is defined for each set, while the FPMM-BLUP is defined over all possible sets. The results in Table 12 are arranged in panels of rows corresponding to average predictors of Daisy's, Lily's, and Rose's latent values. The columns correspond to the average predictor, the bias, and the MSE. The bias and MSE are evaluated for the MM relative to the MM-latent value, and relative to the subject's true latent value. The potentially realizable responses correspond to rows where $t = 1,\dots,4$. The last three rows in Table 12 summarize the average results over potentially realizable responses, over artificial responses that are not potentially realizable, and over all responses. The average bias over all responses is zero for each predictor, but when bias is calculated only over potentially realizable responses, the MM-BLUP is biased, while the FPMM-BLUP is not.

The results in Table 12 illustrate overlapping but distinct sample spaces that underlie the MM and the FPMM predictors. A comparison of the accuracy of the predictors should be based on the average MSE over a common set of potentially observable responses. This corresponds to the MSE for the rows corresponding to $t = 1,\dots,4$ in Table 12.

An Example with N = 4 and n = 3

We consider a slightly larger example to compare the MM-BLUP and FPMM-BLUP. The example is for a population of $N = 4$ where a simple random sample of $n = 3$ subjects is selected, resulting in four possible sample sets. The population consists of the original population given in Table 1, and an additional subject, Violet, with a latent value and response variance given by $y_4 = [...]$ and $\sigma_4^2 = 5$, respectively. The comparison is made for each sample set, assuming that the set constitutes a population for the FPMM and that the latent values underlying $P$ are those of the subjects in the set for the MM. This means that the between subjects variance, $\gamma^2$, is identical for the FPMM and the MM in each set, but both the between subjects variance and the average response error variance, $\bar{\sigma}^2$, are different for different sets. We compare the MSE of the estimates of subjects' latent values from the MM-BLUP (4) and the FPMM-BLUP (7) in Table 13. The results indicate that the MSE of the MM-BLUP is smaller than that of the FPMM-BLUP in most, but not all settings. The FPMM-BLUP MSE is smaller for Rose in the set {Daisy, Rose, and Violet} and for Violet in the set {Daisy, Lily, and Violet}.

Discussion

The comparison of the model-based formulation of the mixed model (2) and the finite population mixed model (6) via the examples provides some insight into the interpretation of mixed models and reveals the opportunity for confusion in this context. First, the comparison provides some clarity to Robinson's (1991) discussion of whether the MM-BLUP should be termed an estimator or a predictor, and underscores the difficulty that Henderson (1975) had in providing a convincing interpretation of the MM-BLUP. Henderson (1984, page 37) posed the problem as follows:

"Which is the more logical concept, prediction of a random variable or estimation of the realized value of a random variable? If we have an animal already born, it seems reasonable to describe the evaluation of its breeding value as an estimation problem. On the other hand, if we are interested in evaluating the potential breeding value of a mating between two potential parents, this would be a problem in prediction."

The terminology of estimation applies to the MM-BLUP when the animal is already born, while prediction applies to the FPMM-BLUP when the mating parents have yet to be selected.

The interpretation of unbiased is also clarified. In the mixed model, we can distinguish $E_{\xi R}(Y_j) = \mu$ from $E_{R \mid \xi}(Y_j) = P_j$ (the MM-latent value for subject $j$) from $E_R(Y_j \mid P = y) = y_j$, the true latent value for subject $j$. If our interest is in the latent value for subject $j$, the unbiased property of the MM-BLUP is defined as $E_{\xi R}(\hat{P}_j) = \mu$. This differs from the usual definition of unbiased, given by $E_R(\hat{P}_j \mid P = y) = y_j$. Neither the MM-BLUP nor the FPMM-BLUP is unbiased when this definition is adopted. The MM-BLUP is a biased estimator of the subject's latent value, while the FPMM-BLUP is a biased predictor of the realized random effect. Including U in the BLUP acronym may provide reassurance that BLUPs are OK for those who consider lack of bias as a pre-requisite for analysis. But truth would be better served if both MM-BLUPs and FPMM-BLUPs were described as biased but more accurate ways of estimating a subject's latent value.

An important aspect of the parallel development of the MM and FPMM is the description of the overlapping but distinct sample spaces. Since the examples we considered are small and the outcomes are discrete, it is possible to make the sample spaces explicit. More generally, the sample space is the product of possible realizations of $P$ and $E$. If response error has $m$ values for each subject, both the sample spaces for the MM and for the FPMM, when the set, the population, and the latent values underlying $P$ coincide ($n = N = N^*$), have $m^n\,n!$ equally likely sample points.
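The count of sample points can be checked against the two enumerated examples with a few lines of Python. This is a small sketch; the function name `sample_space_sizes` is hypothetical, and it assumes, as in the examples, that each of the $n$ subjects has $m$ equally likely response error values and that the latent values are distinct.

```python
from math import factorial


def sample_space_sizes(m, n):
    """Return (total, shared, remaining) sample point counts when each of n
    subjects has m response error values and latent values are permuted."""
    total = m ** n * factorial(n)   # error patterns times permutations
    shared = m ** n                 # points where P equals the actual latent values
    return total, shared, total - shared


two_subjects = sample_space_sizes(2, 2)     # first example: two subjects
three_subjects = sample_space_sizes(2, 3)   # second example: three subjects
```

For two error values per subject, this gives 8 points (4 shared) for two subjects and 48 points (8 shared, 40 remaining) for three subjects, matching the counts described for the examples.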

These sample spaces share $m^n$ sample points where $P = y$. The additional $(n! - 1)\,m^n$ values in the MM when $P \neq y$ are artificial, while the $(n! - 1)\,m^n$ responses in the FPMM correspond to different permutations of the subjects that are all potentially realizable, but not all observed. In this context, the difference between the MM-BLUP and the FPMM-BLUP is due to their development over the different sample spaces.

We advocate evaluating statistics over sample spaces that are potentially realizable. This guideline requires statistics to be linked to reality, implying that only a portion of the sample space be used to evaluate estimators in the MM. It is consistent with Tukey's comment in discussion of Nelder (1977) that our focus must be on questions, not models. By limiting evaluation of the estimators from the two formulations of the mixed model to the potentially realizable sample space, we keep the focus on real questions. With this focus, as illustrated via the examples, the MM-BLUP of a subject's latent value is not uniformly more accurate than the FPMM-BLUP. More study in this area is clearly needed. Guidelines are lacking for estimator choice; understanding is lacking on how to artificially expand a sample space to produce more accurate estimators; practical issues where variance parameters are unknown are yet to be explored; and extensions to settings with auxiliary variables are not yet considered.

The distinction between potentially realizable points in the sample space and artificial sample points in the MM provides a context for understanding the concern expressed in much of the classical statistical literature that only variance components should be estimated and random effects should not be predicted. First, notice that even though the MM includes artificial sample points, $\gamma^2$ can be interpreted as the variance of the subjects' latent values in the set when $n = N^*$ (where $N^*$ denotes the number of latent values underlying $P$), as the variance of the subjects' latent values in the population when $N = N^*$, or as the variance of the subjects' latent values in the superpopulation when $N < N^*$. This provides legitimacy to estimating variance components.
The rationale for concern over predicting random effects in a MM is also evident, since for a subject, there is a difference between the MM-latent values and the subject's latent value. In the MM, the latent value associated with a subject is not constant, but changes for different sample points. There is no reason to be interested in the artificial latent values assigned to a subject. This reasoning provides the logic behind a statement that prediction of random effects has no meaning. Our understanding of this concern changes if we consider estimation of realized random effects, where the term realized implies limiting consideration to sample points that are potentially realizable. By restricting the sample space to such points, the MM latent value is constant for a subject, and equal to the subject's latent value. Estimation of the realized random effect in the MM is meaningful, as is prediction of the realized random effect in the FPMM.

There is a simple connection between the MM and Bayesian methods. The distribution of $P$ in the MM has been termed the objective prior distribution, as in Robinson (1991). It has a simple interpretation as the distribution of subjects' latent values, and characterizes natural variation between subjects. When $n < N$, defining the distribution of $P$ to be a subset of $n$ random variables from an exchangeable distribution of $N$ latent values in the population will expand the number of artificial responses in the MM sample space, but not alter the number of potentially realizable responses. Although each realization of $P$ is a set of latent values from the population, this expansion does not make the estimator based on a set of subjects from the MM more general, nor does it guarantee that the resulting estimator will be more accurate. The accuracy of estimators that are developed from such models should be evaluated only over potentially realizable points in the sample space. Such an evaluation may provide insight as to whether artificial expansion of sample spaces can give rise to more accurate estimators.

It is possible to expand the discussion of Bayesian concepts to include a distribution of fixed effects, which Robinson (1991) refers to as a subjective prior.
A sample point in the resulting joint distribution (of fixed and random effects) must have parameters equal to those for the actual set of subjects in order for potentially realizable responses to be included in the sample space. The extension to subjective priors extends only the number of artificial points in the sample space, and does not alter the set of potentially realizable responses from which the resulting estimator should be evaluated. Still, it is possible that such an extension of the artificial sample points will produce a more accurate estimator in some settings, an area deserving further study.

There is a firm connection between the MM and the FPMM in the survey sampling literature dating back to the important papers of Godambe (1955) and Godambe and Joshi (1965). Their work stimulated a crisis in the foundations of statistical inference, as summarized by Cassel et al. (1977). We discuss this connection, since it provides a unifying framework for ideas of statistical inference.

Godambe (1955) concluded that there is no best linear unbiased estimator of a finite population total based on probability samples. This result was startling since the sample mean from a simple random sample is commonly presented as the BLUE of the population mean. Important ideas in Godambe's development include the very general definition of a linear estimator, and the need for additional assumptions beyond sampling to obtain an optimal estimator. The linear estimator initially proposed by Godambe (1955) includes separate coefficients for each subject in each position in a sample, where sample points correspond to realizations of the subset of the first $n$ random variables representing a permutation of subject values in a finite population. Subsequently, Godambe and Joshi (1965) concluded that it was sufficient for coefficients to be defined for each subject in a sample set, not a sample sequence. Optimal coefficients can incorporate subject-specific information, such as different response error variances, since subjects are identifiable in a set.
Additionally, since the sample set is the starting point, the connection back to the possible sampling probabilities is not relevant, since inference is conditional on the sample set.

The setting considered by Godambe (1955) did not include response error. Adding response error to a subject's latent value in Godambe's basic model does not alter the conclusion of non-existence of a BLUE, even though it is possible to specify a set of estimating equations for a target. While the equations can be solved, the solution does not result in an estimator since it includes non-sample latent values. Godambe (1955) introduced additional a priori model assumptions (motivated by including an auxiliary variable) in order to develop an estimator of the population total. These assumptions are similar to the MM assumptions on latent values. As a result, the MM can be considered to be a variation on the suggestion by Godambe (1955).

These basic ideas constitute the foundation for superpopulation models in survey sampling. We identify aspects of these models that are related to the MM. First, there is a connection between the realized sample and the superpopulation, which we define in terms of a set of latent values given by realizations of $P_j = \mu + a_j$. These latent values need not be simply the latent values for the subjects in the sample set, but could be defined quite generally. When the latent values for the subjects in the sample set are included in this definition, it is always possible to consider the realized sample as a possible sample point in the superpopulation model. Notice how this definition obscures the interpretation of a superpopulation, since the only identifiable subjects are those in the sample. While it may be appealing to think of a superpopulation as a larger finite population (as in Voss (1999)), there is no need to do so.

The FPMM is the result of moving in a different direction as a consequence of Godambe's non-existence results. Rather than including additional assumptions in the model for a sample set, the FPMM collapses random variables to a lower dimensional space. One casualty in the collapsing is a loss in identifiability of subjects for the FPMM random variables. This identifiability is lost when developing predictors of realized random effects, but re-gained once the subjects in the sample set are realized.
In order to maintain these distinctions, Stanek and Singer (2004) have described the FPMM-BLUP as a predictor of the latent value of a realized subject in a position in a sample. This awkward interpretation is a stumbling block for the methods, since the position is not of substantive interest in a practical problem, and the subject (whose latent value is of interest) is not identifiable. By limiting evaluation of the MSE to subjects in the sample set, the FPMM-BLUP is an estimator of a subject's latent value in the set, offering a simple clear interpretation. While the simple examples illustrate that the FPMM predictor may outperform the MM predictor in some settings, guidance for its use is currently lacking.

Some of the main ideas in these results are far reaching. First, we conclude that an important area for investigation is that of statistical inference conditional on the sample set. Second, we conclude that it is crucial to evaluate properties of estimators over potentially realizable sample points, and not include artificial points in the sample space. This simple guideline can eliminate debate over adoption of prior distributions, or other artificial assumptions, since their use in developing estimators is allowed, but the evaluation of the properties of the estimators should be tied to reality. Finally, these results illustrate that there is a lot to be learned. Many accepted procedures, models, and theories appear to be based on ideas that are not consistent with these two conclusions. Their re-examination may lead to a sounder basis for statistical inference in the future.
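The comparison over potentially realizable points can be sketched numerically. The code below is only an illustration under stated assumptions: the set is the whole population ($n = N$), $\gamma^2$ is taken as the variance of the set's latent values with divisor $n - 1$, the MM-BLUP uses subject-specific shrinkage $k_j = \gamma^2/(\gamma^2 + \sigma_j^2)$, and the FPMM-BLUP uses a common $k$ based on the average response error variance. The function names and all numerical values are hypothetical. Each MSE is computed, conditional on the realized latent values, as response error variance plus squared bias.

```python
import numpy as np

def mm_conditional_mse(y, sigma2, gamma2):
    """MSE of the MM-BLUP for each subject, given P = y (realizable points only)."""
    k = gamma2 / (gamma2 + sigma2)             # subject-specific shrinkage k_j
    mse = np.empty(len(y))
    for j in range(len(y)):
        c = (1 - k[j]) / k.sum() * k           # weights on the k-weighted mean
        c[j] += k[j]                           # plus k_j on the subject's own response
        mse[j] = c @ (sigma2 * c) + (c @ y - y[j]) ** 2
    return mse

def fpmm_set_mse(y, sigma2, gamma2):
    """MSE of the FPMM-BLUP for each subject in the set, given the set."""
    n = len(y)
    k = gamma2 / (gamma2 + sigma2.mean())      # common shrinkage (average error variance)
    mse = np.empty(n)
    for j in range(n):
        c = np.full(n, (1 - k) / n)            # weights on the simple mean
        c[j] += k
        mse[j] = c @ (sigma2 * c) + (c @ y - y[j]) ** 2
    return mse

y = np.array([10.0, 14.0, 18.0])               # hypothetical latent values
sigma2 = np.array([1.0, 4.0, 9.0])             # hypothetical response error variances
gamma2 = y.var(ddof=1)

print(mm_conditional_mse(y, sigma2, gamma2))
print(fpmm_set_mse(y, sigma2, gamma2))
```

Under these assumptions the two predictors have identical coefficients when all response error variances are equal, so any difference in accuracy comes from heterogeneous variances and from which subject is targeted.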

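Appendix A.

We develop the predictor of $P_j = X_j\mu + Z_j a$, where $X_j = 1$ and $Z_j = e_j'$, as a linear function of $Y$, i.e. $\hat P_j = c'Y$, that is unbiased and has minimum expected MSE in the mixed model, following the development by Goldberger (1962) as reviewed by Robinson (1991). Here $X = 1_n$ and $Z = I_n$. We first note that

$$E_{\xi R}\begin{pmatrix} Y \\ P_j \end{pmatrix} = \begin{pmatrix} X \\ X_j \end{pmatrix}\mu \quad\text{and}\quad \mathrm{var}_{\xi R}\begin{pmatrix} Y \\ P_j \end{pmatrix} = \begin{pmatrix} \Omega & Z\Gamma Z_j' \\ Z_j\Gamma Z' & Z_j\Gamma Z_j' \end{pmatrix}.$$

The unbiased constraint requires that $E_{\xi R}(\hat P_j - P_j) = 0$. Since

$$E_{\xi R}(\hat P_j - P_j) = E_{\xi R}\left(c'Y - (X_j\mu + Z_j a)\right) = c'X\mu - X_j\mu = (c'X - X_j)\mu,$$

the unbiased constraint is given by $c'X - X_j = 0$. Minimizing

$$\mathrm{var}_{\xi R}(\hat P_j - P_j) = c'\Omega c - 2c'Z\Gamma Z_j' + Z_j\Gamma Z_j'$$

with respect to $c$ subject to the unbiased constraint results in

$$\hat P_j = X_j\hat\mu + Z_j\Gamma Z'\Omega^{-1}(Y - X\hat\mu), \quad\text{where}\quad \hat\mu = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y.$$

Since $\Omega = \bigoplus_{j=1}^{n}(\gamma^2 + \sigma_j^2) - \frac{\gamma^2}{n}J_n$,

$$\Omega^{-1} = \frac{1}{\gamma^2}\left(\bigoplus_{j=1}^{n}k_j + \frac{1}{n(1-\bar k)}\,kk'\right),$$

where $k_j = \dfrac{\gamma^2}{\gamma^2 + \sigma_j^2}$, $k = (k_1\; k_2\; \cdots\; k_n)'$ and $\bar k = \frac{1}{n}\sum_{j=1}^{n}k_j$. Using this result, $X'\Omega^{-1} = \dfrac{1}{\gamma^2(1-\bar k)}\,k'$ and $X'\Omega^{-1}X = \dfrac{n\bar k}{\gamma^2(1-\bar k)}$, so that

$$\hat\mu = \frac{\sum_{j=1}^{n}k_jY_j}{\sum_{j=1}^{n}k_j}.$$

Now $Z_j\Gamma Z'\Omega^{-1} = k_je_j' - \dfrac{1-k_j}{n(1-\bar k)}\,k'$, and hence

$$Z_j\Gamma Z'\Omega^{-1}(Y - X\hat\mu) = k_j(Y_j - \hat\mu) - \frac{1-k_j}{n(1-\bar k)}\,k'(Y - X\hat\mu),$$

where $k'(Y - X\hat\mu) = 0$, so that

$$\hat P_j = X_j\hat\mu + Z_j\Gamma Z'\Omega^{-1}(Y - X\hat\mu) = \hat\mu + k_j(Y_j - \hat\mu).$$

Appendix B. Mean Squared Error

The mean squared error of $\hat P_j$ under the model for $Y$ is given by $\mathrm{var}_{\xi R}(\hat P_j - P_j)$, where $\hat P_j = X_j\hat\mu + Z_j\Gamma Z'\Omega^{-1}(Y - X\hat\mu)$ and $P_j = X_j\mu + Z_j a$. Notice that we can write $\hat P_j = c'Y$ where

$$c = \frac{1-k_j}{\sum_{i=1}^{n}k_i}\,k + k_je_j.$$

First, observe that

$$E_{R\mid\xi}(\hat P_j) = c'P = (1-k_j)P_w + k_jP_j = P_w + k_j(P_j - P_w), \quad\text{where}\quad P_w = \frac{\sum_{i=1}^{n}k_iP_i}{\sum_{i=1}^{n}k_i}.$$

Now $\mathrm{var}_{\xi R}(\hat P_j - P_j) = \mathrm{var}_{\xi R}\left((c'\;\; {-Z_j})\begin{pmatrix} Y \\ a \end{pmatrix}\right)$, where

$$\mathrm{var}_{\xi R}\begin{pmatrix} Y \\ a \end{pmatrix} = \begin{pmatrix} \Omega & \Gamma \\ \Gamma & \Gamma \end{pmatrix} \quad\text{and}\quad \Omega = \Gamma + \bigoplus_{j=1}^{n}\sigma_j^2.$$

As a result,

$$\mathrm{var}_{\xi R}(\hat P_j - P_j) = c'\left(\bigoplus_{i=1}^{n}\sigma_i^2 + \Gamma\right)c - 2Z_j\Gamma c + Z_j\Gamma Z_j'.$$

Using $c$, $Z_j = e_j'$, $\Gamma = \gamma^2\left(I_n - \frac{1}{n}J_n\right)$ and $c'1_n = 1$, the terms involving $\frac{1}{n}$ cancel and

$$\mathrm{var}_{\xi R}(\hat P_j - P_j) = c'\left(\bigoplus_{i=1}^{n}(\sigma_i^2 + \gamma^2)\right)c - 2\gamma^2e_j'c + \gamma^2.$$

These terms simplify, using $k_i^2(\sigma_i^2 + \gamma^2) = k_i\gamma^2$, as

$$c'\left(\bigoplus_{i=1}^{n}(\sigma_i^2 + \gamma^2)\right)c = \gamma^2\left[\frac{(1-k_j)^2}{\sum_{i}k_i} + \frac{2k_j(1-k_j)}{\sum_{i}k_i} + k_j\right] \quad\text{and}\quad e_j'c = \frac{k_j(1-k_j)}{\sum_{i}k_i} + k_j.$$

Combining all terms and simplifying the resulting expression, with $\gamma^2(1-k_j) = k_j\sigma_j^2$,

$$\mathrm{var}_{\xi R}(\hat P_j - P_j) = \sigma_j^2k_j + \frac{(1-k_j)^2\gamma^2}{\sum_{i=1}^{n}k_i}.$$

Appendix C. The Conditional MSE

We evaluate $E_R(\hat P_j - P_j \mid P = y)$. Now $\hat P_j = c'Y$, $E_R(Y \mid P = y) = y$, and $P_j = y_j$ when $P = y$. As a result,

$$E_R(\hat P_j - P_j \mid P = y) = E_R(c'Y - y_j \mid P = y) = c'y - y_j.$$

Since $c = \dfrac{1-k_j}{\sum_{i}k_i}\,k + k_je_j$,

$$c'y = (1-k_j)\sum_{i=1}^{n}w_iy_i + k_jy_j = (1-k_j)\mu_w + k_jy_j,$$

where $w_i = \dfrac{k_i}{\sum_{l=1}^{n}k_l}$ and $\mu_w = \sum_{i=1}^{n}w_iy_i$. Hence,

$$E_R(\hat P_j - P_j \mid P = y) = -(1-k_j)(y_j - \mu_w).$$

We evaluate $E_R\left((\hat P_j - P_j)^2 \mid P = y\right)$ by noting that given $P = y$, $\hat P_j = c'Y = c'(P + E) = c'y + c'E$. Now

$$E_R\left((\hat P_j - P_j)^2 \mid P = y\right) = E_R\left((c'E)^2\right) + 2(c'y - y_j)\,E_R(c'E) + (c'y - y_j)^2,$$

with $E_R(E) = 0$ and $E_R(EE') = \bigoplus_{i=1}^{n}\sigma_i^2$. As a result,

$$E_R\left((\hat P_j - P_j)^2 \mid P = y\right) = c'\left(\bigoplus_{i=1}^{n}\sigma_i^2\right)c + (c'y - y_j)^2.$$

Now

$$c'\left(\bigoplus_{i=1}^{n}\sigma_i^2\right)c = (1-k_j)^2\sum_{i=1}^{n}w_i^2\sigma_i^2 + 2k_j(1-k_j)w_j\sigma_j^2 + k_j^2\sigma_j^2.$$

Using $c'y = (1-k_j)\mu_w + k_jy_j$, $(c'y - y_j)^2 = (1-k_j)^2(y_j - \mu_w)^2$. As a result,

$$E_R\left((\hat P_j - P_j)^2 \mid P = y\right) = (1-k_j)^2\left[\sum_{i=1}^{n}w_i^2\sigma_i^2 + (y_j - \mu_w)^2\right] + k_j\sigma_j^2\left[k_j + 2(1-k_j)w_j\right].$$

Appendix D. Relationship Between Models for Sequences and Sets

Response for sequence $h$ given by $Y_h$ can be related to response defined by the response error model. To see this, we represent the indicator variables that define sequence $h$ in $u_h$ as a permutation (defined by $v_m$) of elements in set $h$ (defined by $\delta_h$) such that $u_h = \delta_h'v_m'$, where

$$\delta_h = \begin{pmatrix} \delta_{h11} & \delta_{h12} & \cdots & \delta_{h1N} \\ \delta_{h21} & \delta_{h22} & \cdots & \delta_{h2N} \\ \vdots & \vdots & & \vdots \\ \delta_{hn1} & \delta_{hn2} & \cdots & \delta_{hnN} \end{pmatrix},$$

$\delta_{hjs}$ has a value of one if the $j$th smallest subject's label in set $h$ is for subject $s$, and zero otherwise, and

$$v_m = \begin{pmatrix} v_{m11} & v_{m12} & \cdots & v_{m1n} \\ v_{m21} & v_{m22} & \cdots & v_{m2n} \\ \vdots & \vdots & & \vdots \\ v_{mn1} & v_{mn2} & \cdots & v_{mnn} \end{pmatrix}$$

with elements $v_{mij}$ having a value of one if the $j$th smallest label in set $h$ is in position $i$, and zero otherwise. For example, when $n = 2$ and $N = 3$, the data for a sequence $h$ consisting of subject $s = 3$ followed by a subject with a smaller label are defined by using $v_m = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ to permute the subjects in set $h$, where the second row of $\delta_h$ is $(0\;\; 0\;\; 1)$.

Appendix E.

The FPMM-BLUP of a sample subject's latent value, $P_i = X_i\mu + Z_ib$ where $X_i = 1$ and $Z_i = e_i'$, may be obtained similarly to the development in Appendix A. We first note that $\Gamma = \gamma^2\left(I_n - \frac{1}{N}J_n\right)$ and $\Omega = \Gamma + \sigma^2I_n$, so that

$$E_{pR}\begin{pmatrix} Y \\ P_i \end{pmatrix} = \begin{pmatrix} X \\ X_i \end{pmatrix}\mu \quad\text{and}\quad \mathrm{var}_{pR}\begin{pmatrix} Y \\ P_i \end{pmatrix} = \begin{pmatrix} \Omega & Z\Gamma Z_i' \\ Z_i\Gamma Z' & Z_i\Gamma Z_i' \end{pmatrix}.$$

The predictor is a linear function of $Y$ given by $\hat P_i = c'Y$ such that $E_{pR}(\hat P_i - P_i) = 0$, which implies that $c'X - X_i = 0$. The FPMM-BLUP is given by

$$\hat P_i = X_i\hat\mu + Z_i\Gamma Z'\Omega^{-1}(Y - X\hat\mu), \quad\text{where}\quad \hat\mu = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y.$$

Since $\Omega^{-1} = \dfrac{1}{\gamma^2 + \sigma^2}\left(I_n + \dfrac{k}{N - nk}J_n\right)$ with $k = \dfrac{\gamma^2}{\gamma^2 + \sigma^2}$,

$$X'\Omega^{-1}X = \frac{n}{(\gamma^2 + \sigma^2)(1 - fk)}, \quad\text{where } f = \frac{n}{N}, \qquad X'\Omega^{-1}Y = \frac{n}{(\gamma^2 + \sigma^2)(1 - fk)}\,\bar Y,$$

so that $\hat\mu = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y = \bar Y$, and

$$Z_i\Gamma Z'\Omega^{-1} = k\left(e_i' - \frac{1-k}{N - nk}\,1_n'\right).$$

Since $Y - X\hat\mu = \left(I_n - \frac{1}{n}J_n\right)Y$, $Z_i\Gamma Z'\Omega^{-1}(Y - X\hat\mu) = k\,e_i'\left(I_n - \frac{1}{n}J_n\right)Y$, so that $\hat P_i = \bar Y + k(Y_i - \bar Y)$. Using these expressions,

$$\hat P_i = \left[\frac{1}{n}1_n' + k\,e_i'\left(I_n - \frac{1}{n}J_n\right)\right]Y = c_i'Y, \quad\text{where}\quad c_i' = \frac{1}{n}1_n' + k\,e_i'\left(I_n - \frac{1}{n}J_n\right).$$

Now $\mathrm{var}_{pR}\begin{pmatrix} Y \\ b \end{pmatrix} = \begin{pmatrix} \Gamma + \sigma^2I_n & \Gamma \\ \Gamma & \Gamma \end{pmatrix}$. As a result,

$$\mathrm{var}_{pR}(\hat P_i - P_i) = c_i'(\sigma^2I_n + \Gamma)c_i - 2Z_i\Gamma c_i + Z_i\Gamma Z_i'.$$

Now

$$c_i'(\sigma^2I_n + \Gamma)c_i = (\sigma^2 + \gamma^2)\,c_i'c_i - \frac{\gamma^2}{N}\,c_i'J_nc_i,$$

and $c_i'c_i = \dfrac{1}{n} + k^2\dfrac{n-1}{n}$ while $c_i'J_nc_i = 1$, resulting in

$$c_i'(\sigma^2I_n + \Gamma)c_i = (\sigma^2 + \gamma^2)\left(\frac{1}{n} + k^2\frac{n-1}{n}\right) - \frac{\gamma^2}{N}.$$

Also, $Z_i\Gamma c_i = \gamma^2\left(\dfrac{1}{n} - \dfrac{1}{N}\right) + k\gamma^2\dfrac{n-1}{n}$ and $Z_i\Gamma Z_i' = \gamma^2\dfrac{N-1}{N}$. Combining these terms and simplifying,

$$\mathrm{var}_{pR}(\hat P_i - P_i) = \frac{\sigma^2}{n}\left(1 + k(n-1)\right).$$

Appendix F. The MSE of the FPMM-BLUP for a Sample Set

The FPMM-BLUP of a sample subject's latent value, $P_i = X_i\mu + Z_ib$ where $X_i = 1$ and $Z_i = e_i'$, is $\hat P_i = \bar Y + k(Y_i - \bar Y)$. The predictor is developed over all possible sample sequences identified by $u_h = \delta_h'v_m'$, where a realized sample sequence is the realization of $u_h$. We evaluate $\mathrm{var}_{pR}(\hat P_i - P_i \mid h)$. Since $Y_h = v_m\delta_hY$ and $P_h = v_m\delta_hy$,

$$(\hat P_i - P_i)\mid h = c_m'\delta_hY - g_m'\delta_hy,$$

where we define $c_m' = c_i'v_m$ and $g_m' = e_i'v_m$. As a result,

$$\mathrm{var}_{pR}(\hat P_i - P_i \mid h) = \mathrm{var}_R(c_m'\delta_hY) = c_m'\,\delta_h\mathrm{var}_R(Y)\delta_h'\,c_m.$$

Now $\delta_h\mathrm{var}_R(Y)\delta_h' = \bigoplus_{l=1}^{n}\sigma_{hl}^2$ and $c_m' = \left[\frac{1}{n}1_n' + k\,e_i'\left(I_n - \frac{1}{n}J_n\right)\right]v_m$. Notice that $e_i'v_m = e_j'$, where $j$ indexes the subject in set $h$ that occupies position $i$, and $1_n'v_m = 1_n'$, so that we can express $c_m' = \frac{1}{n}1_n' + k\,e_j'\left(I_n - \frac{1}{n}J_n\right)$. As a result, defining $\bar\sigma_h^2 = \frac{1}{n}\sum_{l=1}^{n}\sigma_{hl}^2$,

$$\mathrm{var}_{pR}(\hat P_i - P_i \mid h) = c_m'\left(\bigoplus_{l=1}^{n}\sigma_{hl}^2\right)c_m = \frac{1}{n}\bar\sigma_h^2 + \frac{2k}{n}\left(\sigma_{hj}^2 - \bar\sigma_h^2\right) + k^2\left(\sigma_{hj}^2 - \frac{2}{n}\sigma_{hj}^2 + \frac{1}{n}\bar\sigma_h^2\right).$$

Also, setting $\mu_h = \frac{1}{n}\sum_{l=1}^{n}y_{hl}$,

$$E_R(\hat P_i - P_i \mid h) = (c_m - g_m)'E_R(\delta_hY) = -(1-k)(y_{hj} - \mu_h),$$

and the MSE is given by

$$\mathrm{MSE}_{pR}(\hat P_i - P_i \mid h) = \frac{(1-k)^2}{n}\bar\sigma_h^2 + k\left(k + \frac{2(1-k)}{n}\right)\sigma_{hj}^2 + (1-k)^2(y_{hj} - \mu_h)^2.$$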
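The closed-form results of Appendices A and B can be checked numerically against the general matrix expressions. This is a hedged sketch rather than the authors' code: it assumes $X = 1_n$, $Z = I_n$, $\Gamma = \gamma^2(I_n - J_n/n)$ and $\Omega = \Gamma + \bigoplus_j\sigma_j^2$ with known variance components, and every function name and numerical value is hypothetical.

```python
import numpy as np

def mm_blup_closed_form(Y, gamma2, sigma2):
    """MM-BLUP via mu_hat + k_j (Y_j - mu_hat) with a k-weighted mean mu_hat."""
    k = gamma2 / (gamma2 + sigma2)
    mu_hat = np.sum(k * Y) / np.sum(k)
    return mu_hat + k * (Y - mu_hat)

def mm_blup_matrix(Y, gamma2, sigma2):
    """MM-BLUP via X mu_hat + Gamma Omega^{-1} (Y - X mu_hat), mu_hat by GLS."""
    n = len(Y)
    X = np.ones((n, 1))
    Gamma = gamma2 * (np.eye(n) - np.ones((n, n)) / n)   # assumed exchangeable structure
    Omega = Gamma + np.diag(sigma2)
    Oinv = np.linalg.inv(Omega)
    mu_hat = np.linalg.solve(X.T @ Oinv @ X, X.T @ Oinv @ Y)
    return X @ mu_hat + Gamma @ Oinv @ (Y - X @ mu_hat)

def mm_mse_closed_form(gamma2, sigma2):
    """MSE as k_j sigma_j^2 + (1 - k_j)^2 gamma^2 / sum(k)."""
    k = gamma2 / (gamma2 + sigma2)
    return k * sigma2 + (1 - k) ** 2 * gamma2 / k.sum()

def mm_mse_matrix(gamma2, sigma2):
    """MSE as c' Omega c - 2 e_j' Gamma c + e_j' Gamma e_j for each subject j."""
    n = len(sigma2)
    k = gamma2 / (gamma2 + sigma2)
    Gamma = gamma2 * (np.eye(n) - np.ones((n, n)) / n)
    Omega = Gamma + np.diag(sigma2)
    mse = np.empty(n)
    for j in range(n):
        c = (1 - k[j]) / k.sum() * k
        c[j] += k[j]
        mse[j] = c @ Omega @ c - 2 * Gamma[j] @ c + Gamma[j, j]
    return mse

Y = np.array([12.0, 15.0, 9.0])          # hypothetical responses
gamma2 = 4.0                             # hypothetical latent-value variance
sigma2 = np.array([1.0, 2.0, 3.0])       # hypothetical response error variances

print(np.allclose(mm_blup_closed_form(Y, gamma2, sigma2), mm_blup_matrix(Y, gamma2, sigma2)))
print(np.allclose(mm_mse_closed_form(gamma2, sigma2), mm_mse_matrix(gamma2, sigma2)))
```

Agreement between the two routes is only a consistency check on the algebra under the assumed covariance structure; it says nothing about behavior when the variance components must be estimated.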

References

Brown, E.M., and Kass, R.E. (2009). What is statistics? (with discussion), The American Statistician, 63.

Cassel, C.M., Särndal, C.E. and Wretman, J.H. (1977). Foundations of Inference in Survey Sampling, New York, NY: John Wiley.

Godambe, V.P. (1955). A unified theory of sampling from finite populations. Journal of the Royal Statistical Society B, 17.

Godambe, V.P. and Joshi, V.M. (1965). Admissibility and Bayes estimation in sampling from finite population. I. The Annals of Mathematical Statistics.

Henderson, C.R. (1975). Best linear unbiased estimation and prediction under a selection model, Biometrics, 31.

Henderson, C.R. (1984). Applications of Linear Models in Animal Breeding. University of Guelph, Guelph, Canada.

Koch, G.G. (1967). A procedure to estimate the population mean in random effects models, Technometrics, 9.

Koch, G.G. (1973). An alternative approach to multivariate response error models for sample survey data with applications to estimators involving subclass means, Journal of the American Statistical Association, 68.

Koch, G.G., Gillings, D.B., and Stokes, M.E. (1980). Biostatistical implications of design, sampling, and measurement to health science data analysis, Ann. Rev. Public Health, 1:163-225.

Merriam, P.A., Ockene, I.S., Hebert, J.R., Milagros, C.R., and Matthews, C.E. (1999). Seasonal variation of blood cholesterol levels: study methodology, Journal of Biological Rhythms, Vol. 14, No. 4.

Nelder, J.A. (1977). A reformulation of linear models (with discussion). Journal of the Royal Statistical Society A.

Ockene, I.S., Chiriboga, D.E., Stanek, E.J., Harmatz, M.G., Nicolosi, R., Saperia, G., Well, A.D., Merriam, P.A., Reed, G., Ma, Y., Matthews, C.E. and Hebert, J.R. (2004). Seasonal variation in serum cholesterol: Treatment implications and possible mechanisms. Archives of Internal Medicine, 164.

Robinson, G.K. (1991). That BLUP is a good thing: the estimation of random effects, Statistical Science, 6:15-51.

Stanek, E.J. and Singer, J.M. (2004). Predicting Random Effects from Finite Population Clustered Samples with Response Error, Journal of the American Statistical Association, 99.

Voss, D.T. (1999). Resolving the Mixed Models Controversy. The American Statistician.

List of Table Titles

Table 1. Population Values and Parameters for Simple Example

Table 2. Potentially Realizable Responses for the Set {Daisy, Rose} Assuming a Mixed Model (Source: c09ed33.xls)

Table 3. Additional (non-realizable) Responses for the Set {Daisy, Rose} Assuming a Mixed Model

Table 4. Predictors of MM-Latent Values, Difference from P, and MSE for the Set {Daisy, Rose}.

Table 5. Predictors of Subject's Latent Values, Difference from y, and MSE for the Set {Daisy, Rose}.

Table 6. Estimators of the Mean Latent Value P = y from a Response Error Model and MM, the Difference from the Mean, and the MSE for the Set {Daisy, Rose}.

Table 7. Potentially Realizable Responses for the Set {Daisy, Rose, Lily} Assuming a Response Error Model

Table 8. Predictors of Daisy's Latent Value, Difference from MM Latent Value P, and MSE for the Set {Daisy, Rose, Lily} for Potentially Realizable Responses.

Table 9. Predictors of Daisy's Latent Value, Difference from MM Latent Value P, and MSE for the Set {Daisy, Rose, Lily} for Non-realizable Responses.

Table 10. Summary of MM-Average Predictor, P, MM-Latent Value, P, Difference, and MSE for Responses t = 1,...,8; t = 9,...,48; and t = 1,...,48 for the Set {Daisy, Rose, Lily}

Table 11. Finite Population Mixed Model Response and Predictors of Realized Latent Values

Table 12. Comparison of Average, Bias and MSE of Predictors of Subject's Latent Values Using the Subject's Response Error Model (RE), $Y_j$, the MM-BLUP, $\hat P_j$, and the FPMM-BLUP, $\hat P_i$.

Table 13. Comparison of the MSE between the MM-BLUP and the FPMM-BLUP for Potentially Realizable Sample Sets of Size n = 3 from a Population of N = 4.

List of Figure Titles

Figure 1. Mean vs Standard Deviation of Saturated Fat Intake (gm/day) for n=554 subjects with 0 or more 24hr recall measures in Seasons Study.


More information

6 Sample Size Calculations

6 Sample Size Calculations 6 Sample Size Calculatios Oe of the major resposibilities of a cliical trial statisticia is to aid the ivestigators i determiig the sample size required to coduct a study The most commo procedure for determiig

More information

ESTIMATION AND PREDICTION BASED ON K-RECORD VALUES FROM NORMAL DISTRIBUTION

ESTIMATION AND PREDICTION BASED ON K-RECORD VALUES FROM NORMAL DISTRIBUTION STATISTICA, ao LXXIII,. 4, 013 ESTIMATION AND PREDICTION BASED ON K-RECORD VALUES FROM NORMAL DISTRIBUTION Maoj Chacko Departmet of Statistics, Uiversity of Kerala, Trivadrum- 695581, Kerala, Idia M. Shy

More information

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A.

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A. Radom Walks o Discrete ad Cotiuous Circles by Jeffrey S. Rosethal School of Mathematics, Uiversity of Miesota, Mieapolis, MN, U.S.A. 55455 (Appeared i Joural of Applied Probability 30 (1993), 780 789.)

More information

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008

Chapter 6 Part 5. Confidence Intervals t distribution chi square distribution. October 23, 2008 Chapter 6 Part 5 Cofidece Itervals t distributio chi square distributio October 23, 2008 The will be o help sessio o Moday, October 27. Goal: To clearly uderstad the lik betwee probability ad cofidece

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

Axioms of Measure Theory

Axioms of Measure Theory MATH 532 Axioms of Measure Theory Dr. Neal, WKU I. The Space Throughout the course, we shall let X deote a geeric o-empty set. I geeral, we shall ot assume that ay algebraic structure exists o X so that

More information

Introductory statistics

Introductory statistics CM9S: Machie Learig for Bioiformatics Lecture - 03/3/06 Itroductory statistics Lecturer: Sriram Sakararama Scribe: Sriram Sakararama We will provide a overview of statistical iferece focussig o the key

More information

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution

Double Stage Shrinkage Estimator of Two Parameters. Generalized Exponential Distribution Iteratioal Mathematical Forum, Vol., 3, o. 3, 3-53 HIKARI Ltd, www.m-hikari.com http://dx.doi.org/.9/imf.3.335 Double Stage Shrikage Estimator of Two Parameters Geeralized Expoetial Distributio Alaa M.

More information

Understanding Samples

Understanding Samples 1 Will Moroe CS 109 Samplig ad Bootstrappig Lecture Notes #17 August 2, 2017 Based o a hadout by Chris Piech I this chapter we are goig to talk about statistics calculated o samples from a populatio. We

More information

Proof of Goldbach s Conjecture. Reza Javaherdashti

Proof of Goldbach s Conjecture. Reza Javaherdashti Proof of Goldbach s Cojecture Reza Javaherdashti farzijavaherdashti@gmail.com Abstract After certai subsets of Natural umbers called Rage ad Row are defied, we assume (1) there is a fuctio that ca produce

More information

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures

FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING. Lectures FACULTY OF MATHEMATICAL STUDIES MATHEMATICS FOR PART I ENGINEERING Lectures MODULE 5 STATISTICS II. Mea ad stadard error of sample data. Biomial distributio. Normal distributio 4. Samplig 5. Cofidece itervals

More information

CEU Department of Economics Econometrics 1, Problem Set 1 - Solutions

CEU Department of Economics Econometrics 1, Problem Set 1 - Solutions CEU Departmet of Ecoomics Ecoometrics, Problem Set - Solutios Part A. Exogeeity - edogeeity The liear coditioal expectatio (CE) model has the followig form: We would like to estimate the effect of some

More information

Some examples of vector spaces

Some examples of vector spaces Roberto s Notes o Liear Algebra Chapter 11: Vector spaces Sectio 2 Some examples of vector spaces What you eed to kow already: The te axioms eeded to idetify a vector space. What you ca lear here: Some

More information

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random

10. Comparative Tests among Spatial Regression Models. Here we revisit the example in Section 8.1 of estimating the mean of a normal random Part III. Areal Data Aalysis 0. Comparative Tests amog Spatial Regressio Models While the otio of relative likelihood values for differet models is somewhat difficult to iterpret directly (as metioed above),

More information

Probability, Expectation Value and Uncertainty

Probability, Expectation Value and Uncertainty Chapter 1 Probability, Expectatio Value ad Ucertaity We have see that the physically observable properties of a quatum system are represeted by Hermitea operators (also referred to as observables ) such

More information

A statistical method to determine sample size to estimate characteristic value of soil parameters

A statistical method to determine sample size to estimate characteristic value of soil parameters A statistical method to determie sample size to estimate characteristic value of soil parameters Y. Hojo, B. Setiawa 2 ad M. Suzuki 3 Abstract Sample size is a importat factor to be cosidered i determiig

More information

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample.

Statistical Inference (Chapter 10) Statistical inference = learn about a population based on the information provided by a sample. Statistical Iferece (Chapter 10) Statistical iferece = lear about a populatio based o the iformatio provided by a sample. Populatio: The set of all values of a radom variable X of iterest. Characterized

More information

Modified Ratio Estimators Using Known Median and Co-Efficent of Kurtosis

Modified Ratio Estimators Using Known Median and Co-Efficent of Kurtosis America Joural of Mathematics ad Statistics 01, (4): 95-100 DOI: 10.593/j.ajms.01004.05 Modified Ratio s Usig Kow Media ad Co-Efficet of Kurtosis J.Subramai *, G.Kumarapadiya Departmet of Statistics, Podicherry

More information

Algebra of Least Squares

Algebra of Least Squares October 19, 2018 Algebra of Least Squares Geometry of Least Squares Recall that out data is like a table [Y X] where Y collects observatios o the depedet variable Y ad X collects observatios o the k-dimesioal

More information

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function.

It is always the case that unions, intersections, complements, and set differences are preserved by the inverse image of a function. MATH 532 Measurable Fuctios Dr. Neal, WKU Throughout, let ( X, F, µ) be a measure space ad let (!, F, P ) deote the special case of a probability space. We shall ow begi to study real-valued fuctios defied

More information

Estimation of Population Mean Using Co-Efficient of Variation and Median of an Auxiliary Variable

Estimation of Population Mean Using Co-Efficient of Variation and Median of an Auxiliary Variable Iteratioal Joural of Probability ad Statistics 01, 1(4: 111-118 DOI: 10.593/j.ijps.010104.04 Estimatio of Populatio Mea Usig Co-Efficiet of Variatio ad Media of a Auxiliary Variable J. Subramai *, G. Kumarapadiya

More information

1 Hash tables. 1.1 Implementation

1 Hash tables. 1.1 Implementation Lecture 8 Hash Tables, Uiversal Hash Fuctios, Balls ad Bis Scribes: Luke Johsto, Moses Charikar, G. Valiat Date: Oct 18, 2017 Adapted From Virgiia Williams lecture otes 1 Hash tables A hash table is a

More information

CHAPTER 10 INFINITE SEQUENCES AND SERIES

CHAPTER 10 INFINITE SEQUENCES AND SERIES CHAPTER 10 INFINITE SEQUENCES AND SERIES 10.1 Sequeces 10.2 Ifiite Series 10.3 The Itegral Tests 10.4 Compariso Tests 10.5 The Ratio ad Root Tests 10.6 Alteratig Series: Absolute ad Coditioal Covergece

More information

Sequences, Mathematical Induction, and Recursion. CSE 2353 Discrete Computational Structures Spring 2018

Sequences, Mathematical Induction, and Recursion. CSE 2353 Discrete Computational Structures Spring 2018 CSE 353 Discrete Computatioal Structures Sprig 08 Sequeces, Mathematical Iductio, ad Recursio (Chapter 5, Epp) Note: some course slides adopted from publisher-provided material Overview May mathematical

More information

Mathematical Induction

Mathematical Induction Mathematical Iductio Itroductio Mathematical iductio, or just iductio, is a proof techique. Suppose that for every atural umber, P() is a statemet. We wish to show that all statemets P() are true. I a

More information

6.867 Machine learning, lecture 7 (Jaakkola) 1

6.867 Machine learning, lecture 7 (Jaakkola) 1 6.867 Machie learig, lecture 7 (Jaakkola) 1 Lecture topics: Kerel form of liear regressio Kerels, examples, costructio, properties Liear regressio ad kerels Cosider a slightly simpler model where we omit

More information

Topic 5: Basics of Probability

Topic 5: Basics of Probability Topic 5: Jue 1, 2011 1 Itroductio Mathematical structures lie Euclidea geometry or algebraic fields are defied by a set of axioms. Mathematical reality is the developed through the itroductio of cocepts

More information

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10

DS 100: Principles and Techniques of Data Science Date: April 13, Discussion #10 DS 00: Priciples ad Techiques of Data Sciece Date: April 3, 208 Name: Hypothesis Testig Discussio #0. Defie these terms below as they relate to hypothesis testig. a) Data Geeratio Model: Solutio: A set

More information

1.010 Uncertainty in Engineering Fall 2008

1.010 Uncertainty in Engineering Fall 2008 MIT OpeCourseWare http://ocw.mit.edu.00 Ucertaity i Egieerig Fall 2008 For iformatio about citig these materials or our Terms of Use, visit: http://ocw.mit.edu.terms. .00 - Brief Notes # 9 Poit ad Iterval

More information

Lecture 12: September 27

Lecture 12: September 27 36-705: Itermediate Statistics Fall 207 Lecturer: Siva Balakrisha Lecture 2: September 27 Today we will discuss sufficiecy i more detail ad the begi to discuss some geeral strategies for costructig estimators.

More information

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4.

Definition 4.2. (a) A sequence {x n } in a Banach space X is a basis for X if. unique scalars a n (x) such that x = n. a n (x) x n. (4. 4. BASES I BAACH SPACES 39 4. BASES I BAACH SPACES Sice a Baach space X is a vector space, it must possess a Hamel, or vector space, basis, i.e., a subset {x γ } γ Γ whose fiite liear spa is all of X ad

More information

Distribution of Random Samples & Limit theorems

Distribution of Random Samples & Limit theorems STAT/MATH 395 A - PROBABILITY II UW Witer Quarter 2017 Néhémy Lim Distributio of Radom Samples & Limit theorems 1 Distributio of i.i.d. Samples Motivatig example. Assume that the goal of a study is to

More information

An Introduction to Randomized Algorithms

An Introduction to Randomized Algorithms A Itroductio to Radomized Algorithms The focus of this lecture is to study a radomized algorithm for quick sort, aalyze it usig probabilistic recurrece relatios, ad also provide more geeral tools for aalysis

More information

Statisticians use the word population to refer the total number of (potential) observations under consideration

Statisticians use the word population to refer the total number of (potential) observations under consideration 6 Samplig Distributios Statisticias use the word populatio to refer the total umber of (potetial) observatios uder cosideratio The populatio is just the set of all possible outcomes i our sample space

More information

The Random Walk For Dummies

The Random Walk For Dummies The Radom Walk For Dummies Richard A Mote Abstract We look at the priciples goverig the oe-dimesioal discrete radom walk First we review five basic cocepts of probability theory The we cosider the Beroulli

More information

7.1 Convergence of sequences of random variables

7.1 Convergence of sequences of random variables Chapter 7 Limit Theorems Throughout this sectio we will assume a probability space (, F, P), i which is defied a ifiite sequece of radom variables (X ) ad a radom variable X. The fact that for every ifiite

More information

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ

If, for instance, we were required to test whether the population mean μ could be equal to a certain value μ STATISTICAL INFERENCE INTRODUCTION Statistical iferece is that brach of Statistics i which oe typically makes a statemet about a populatio based upo the results of a sample. I oesample testig, we essetially

More information

Dirichlet s Theorem on Arithmetic Progressions

Dirichlet s Theorem on Arithmetic Progressions Dirichlet s Theorem o Arithmetic Progressios Athoy Várilly Harvard Uiversity, Cambridge, MA 0238 Itroductio Dirichlet s theorem o arithmetic progressios is a gem of umber theory. A great part of its beauty

More information

GUIDELINES ON REPRESENTATIVE SAMPLING

GUIDELINES ON REPRESENTATIVE SAMPLING DRUGS WORKING GROUP VALIDATION OF THE GUIDELINES ON REPRESENTATIVE SAMPLING DOCUMENT TYPE : REF. CODE: ISSUE NO: ISSUE DATE: VALIDATION REPORT DWG-SGL-001 002 08 DECEMBER 2012 Ref code: DWG-SGL-001 Issue

More information

Improved Class of Ratio -Cum- Product Estimators of Finite Population Mean in two Phase Sampling

Improved Class of Ratio -Cum- Product Estimators of Finite Population Mean in two Phase Sampling Global Joural of Sciece Frotier Research: F Mathematics ad Decisio Scieces Volume 4 Issue 2 Versio.0 Year 204 Type : Double Blid Peer Reviewed Iteratioal Research Joural Publisher: Global Jourals Ic. (USA

More information

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator

Economics 241B Relation to Method of Moments and Maximum Likelihood OLSE as a Maximum Likelihood Estimator Ecoomics 24B Relatio to Method of Momets ad Maximum Likelihood OLSE as a Maximum Likelihood Estimator Uder Assumptio 5 we have speci ed the distributio of the error, so we ca estimate the model parameters

More information

Frequentist Inference

Frequentist Inference Frequetist Iferece The topics of the ext three sectios are useful applicatios of the Cetral Limit Theorem. Without kowig aythig about the uderlyig distributio of a sequece of radom variables {X i }, for

More information

( ) = p and P( i = b) = q.

( ) = p and P( i = b) = q. MATH 540 Radom Walks Part 1 A radom walk X is special stochastic process that measures the height (or value) of a particle that radomly moves upward or dowward certai fixed amouts o each uit icremet of

More information

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara

Econ 325 Notes on Point Estimator and Confidence Interval 1 By Hiro Kasahara Poit Estimator Eco 325 Notes o Poit Estimator ad Cofidece Iterval 1 By Hiro Kasahara Parameter, Estimator, ad Estimate The ormal probability desity fuctio is fully characterized by two costats: populatio

More information

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence

A sequence of numbers is a function whose domain is the positive integers. We can see that the sequence Sequeces A sequece of umbers is a fuctio whose domai is the positive itegers. We ca see that the sequece,, 2, 2, 3, 3,... is a fuctio from the positive itegers whe we write the first sequece elemet as

More information

G. R. Pasha Department of Statistics Bahauddin Zakariya University Multan, Pakistan

G. R. Pasha Department of Statistics Bahauddin Zakariya University Multan, Pakistan Deviatio of the Variaces of Classical Estimators ad Negative Iteger Momet Estimator from Miimum Variace Boud with Referece to Maxwell Distributio G. R. Pasha Departmet of Statistics Bahauddi Zakariya Uiversity

More information

Math 113 Exam 3 Practice

Math 113 Exam 3 Practice Math Exam Practice Exam will cover.-.9. This sheet has three sectios. The first sectio will remid you about techiques ad formulas that you should kow. The secod gives a umber of practice questios for you

More information

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT

Geometry of LS. LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT OCTOBER 7, 2016 LECTURE 3 GEOMETRY OF LS, PROPERTIES OF σ 2, PARTITIONED REGRESSION, GOODNESS OF FIT Geometry of LS We ca thik of y ad the colums of X as members of the -dimesioal Euclidea space R Oe ca

More information

Topic 9: Sampling Distributions of Estimators

Topic 9: Sampling Distributions of Estimators Topic 9: Samplig Distributios of Estimators Course 003, 2016 Page 0 Samplig distributios of estimators Sice our estimators are statistics (particular fuctios of radom variables), their distributio ca be

More information

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014.

Product measures, Tonelli s and Fubini s theorems For use in MAT3400/4400, autumn 2014 Nadia S. Larsen. Version of 13 October 2014. Product measures, Toelli s ad Fubii s theorems For use i MAT3400/4400, autum 2014 Nadia S. Larse Versio of 13 October 2014. 1. Costructio of the product measure The purpose of these otes is to preset the

More information

MOMENT-METHOD ESTIMATION BASED ON CENSORED SAMPLE

MOMENT-METHOD ESTIMATION BASED ON CENSORED SAMPLE Vol. 8 o. Joural of Systems Sciece ad Complexity Apr., 5 MOMET-METHOD ESTIMATIO BASED O CESORED SAMPLE I Zhogxi Departmet of Mathematics, East Chia Uiversity of Sciece ad Techology, Shaghai 37, Chia. Email:

More information

A Relationship Between the One-Way MANOVA Test Statistic and the Hotelling Lawley Trace Test Statistic

A Relationship Between the One-Way MANOVA Test Statistic and the Hotelling Lawley Trace Test Statistic http://ijspccseetorg Iteratioal Joural of Statistics ad Probability Vol 7, No 6; 2018 A Relatioship Betwee the Oe-Way MANOVA Test Statistic ad the Hotellig Lawley Trace Test Statistic Hasthika S Rupasighe

More information

Problem Set 4 Due Oct, 12

Problem Set 4 Due Oct, 12 EE226: Radom Processes i Systems Lecturer: Jea C. Walrad Problem Set 4 Due Oct, 12 Fall 06 GSI: Assae Gueye This problem set essetially reviews detectio theory ad hypothesis testig ad some basic otios

More information

SEQUENCES AND SERIES

SEQUENCES AND SERIES 9 SEQUENCES AND SERIES INTRODUCTION Sequeces have may importat applicatios i several spheres of huma activities Whe a collectio of objects is arraged i a defiite order such that it has a idetified first

More information