Sampling, WLS, and Mixed Models Festschrift to Honor Professor Gary Koch


Edward J. Stanek III
Department of Public Health, University of Massachusetts, Amherst, MA

and

Julio M. Singer
Departamento de Estatística, Universidade de São Paulo, Brazil

Abstract

Mixed models may be defined with or without reference to sampling or sampling random variables, and can be used to predict realized random effects. A common application involves the estimation of latent values of study subjects measured with response error. In this context, mixed models may be specified as a sum of two random variables, with one stemming from an exchangeable distribution of latent values of study subjects and the other from the study subjects' response error distributions. Such models assign positive probabilities both to potentially realizable responses and to artificial responses that are not potentially realizable. This has an impact on the definition of the parameters associated with study subjects, on the interpretation of bias, and on the evaluation of predictors. In contrast, finite population mixed models may be defined to represent the two-stage process of sampling subjects and measuring their responses. Such models assign positive probabilities only to potentially realizable responses. We consider the problem of estimating a subject's latent value measured with response error and compare the two mixed model formulations via a simple example. An analysis of the performance of the corresponding predictors over the same potentially realizable responses indicates that the optimal linear mixed model predictor (the usual BLUP) is often (but not always) more accurate than the comparable finite population mixed model BLUP. The example provides the basis for a broader discussion of other linear estimators such as weighted least squares, and the role of conditioning, sampling, and model assumptions in developing inference.

C09ed3v.doc 0/7/009 5:40 PM

Introduction

Advances in public health and health science are tied to understanding practical implications of changes in policy, programs, underlying cause of disease, prevention, and/or treatment (Koch et al., 1980). Understanding the impact of such changes is the focus of much of Biostatistics. Not only does Biostatistics embrace the theoretical underpinnings of statistical modeling, but it seeks to tie the results of studies to actual reality. It is for this reason that sampling plays an important role in many applications of Biostatistics, since estimates are needed for real populations. This is not a simple process, since it involves reconciling seemingly ad hoc approaches, such as Koch's (1967) procedure to estimate the population mean, with the fundamental basis of inference from survey sampling when extended, for example, to response error (Koch, 1973) and to model-based approaches. The struggle to understand the basic underpinnings of Biostatistics has been the focus of much of Koch's work, and continues to be of compelling interest, as discussed by Brown and Kass (2009). We discuss a simple setting that we feel challenges the depths of understanding of statistical inference: estimating the latent value of a subject.

There are many settings where interest lies in the latent value for study subjects. An example is the Seasons Study (Merriam et al., 1999; Ockene et al., 2004), where three 24-hour recall dietary interviews were collected on each study subject in each season of a year to evaluate seasonal cholesterol changes, controlling for the contribution of saturated fat intake. The 24-hour recalls were used to estimate the average saturated fat intake for each subject (the latent value) in the six weeks prior to cholesterol measurement. Average saturated fat intake and the estimated standard deviation for 554 study subjects for the first season in the study are displayed in Figure 1. Both the latent value and the variance in saturated fat intake vary among subjects.
Rather than using the simple average for the season to estimate a subject's latent saturated fat intake, a more accurate estimate may be obtained by using a best linear unbiased predictor (BLUP) in a mixed model (MM) with subjects as random effects. Although mixed model BLUPs are commonly used to estimate realized subjects' latent values, a close examination reveals that a portion of the MM sample space is artificial and not potentially realizable. This prompts a reexamination of the interpretation of latent values, the bias of the MM-BLUP, and the criteria used to evaluate its performance. This also provides a context for comparison with the finite population mixed model (FPMM) considered by Stanek and Singer (2004), which includes sampling and avoids counterfactuals that are not potentially realizable.

We discuss these issues in the context of a simple problem. First, we develop a mixed model for a set of subjects whose responses follow a simple response error model by adding the assumption that the subject latent values are random effects. This introduces definitions and notation. Next, we discuss a simple example to distinguish (possibly artificial) MM-latent values from (actual) subject latent values, and MM-responses from potentially realizable responses. We follow by assuming that the set is selected from a finite population of size N = 3 and review the FPMM along with the corresponding BLUP in this context. We conclude with a discussion of the connections between the issues raised in the example and some broader ideas in Statistics.

The Mixed Model Predictor

Many frameworks can be used to develop the BLUP under a MM, as discussed by Robinson (1991). The framework we use begins with an additive response error model for each subject, and assumes exchangeability of the corresponding latent values. The study subjects constitute a set that may or may not have been obtained as a result of a probability sample from a population. In the Seasons Study, for example, the study subjects constitute a volunteer subset of members of the Fallon Health Maintenance Organization.

We start with a set of $n$ subjects, labeled $j = 1, \ldots, n$, and assume that repeated responses, $Y_{jk}$, $k = 1, \ldots, r$, are associated with subject $j$. The data for the set correspond to the pairs $(j, (Y_{j1}, Y_{j2}, \ldots, Y_{jr}))$, $j = 1, \ldots, n$. We assume that the responses associated with subject $j$ are independent and identically distributed random variables $Y_{jk}$, $k = 1, \ldots, r$, and define $E_R(Y_{jk}) = y_j$ as the latent value for subject $j$; we also let $\mathrm{var}_R(Y_{jk}) = \sigma_j^2$ denote the corresponding response error variance. The subscript $R$ indicates expectation with respect to the distribution of the response error. For simplicity, we consider a single measure for subject $j$ and drop the subscript $k$ so that the response error model may be written as

$Y_j = y_j + E_j$.   (1)

When $r > 1$, $Y_j$ and $E_j$ correspond to the average response and average response error, respectively, and $\sigma_j^2$ represents the variance of these averages, which we assume known. The latent values, $y_j$, are the parameters of interest. Without additional assumptions, the response for subject $j$, namely $Y_j$, is the best estimator of the subject's latent value.

We define a MM by adding to the response error model the assumption that the latent values for the subjects are a realization of $P = (P_1\ P_2\ \cdots\ P_n)'$, an exchangeable vector of random variables whose possible equally likely values are the latent values of the subjects in the set. A realization of $P_j$ is a MM-latent value, which we re-parameterize as $P_j = \mu + a_j$, $j = 1, \ldots, n$, and for which we assume that $E_\xi(a_j) = 0$, $E_\xi(a_j a_{j^*}) = \frac{n-1}{n}\gamma$ when $j = j^*$, or $E_\xi(a_j a_{j^*}) = -\frac{1}{n}\gamma$ otherwise, with $\mu = \frac{1}{n}\sum_{j=1}^{n} y_j$ and $\gamma = \frac{1}{n-1}\sum_{j=1}^{n}(y_j - \mu)^2$. The subscript $\xi$ indicates expectation with respect to the distribution of latent values. One possible realization of $P$ is $y = (y_1\ y_2\ \cdots\ y_n)'$, while other possible realizations of $P$ are permutations of these values. The MM is given by

$Y_j = \mu + a_j + E_j$   (2)

or, in matrix form, by

$Y = X\mu + Za + E$

where $Y = (Y_1\ Y_2\ \cdots\ Y_n)'$; $X = 1_n$, a column vector with all elements equal to 1; $Z = I_n$, an identity matrix; $a = (a_1\ a_2\ \cdots\ a_n)'$, the vector of random effects; and $E = (E_1\ E_2\ \cdots\ E_n)'$, a vector of response errors. In the MM, $E_{\xi R}(Y) = X\mu$, while $\mathrm{var}_{\xi R}(Y) = \Omega$, where $\Omega = \Gamma + \bigoplus_{j=1}^{n}\sigma_j^2$, $\Gamma = \gamma\left(I_n - \frac{1}{n}J_n\right)$ with $J_n = 1_n 1_n'$, and $\bigoplus_{j=1}^{n}\sigma_j^2$ denotes an $n \times n$ matrix with diagonal elements $\sigma_j^2$ and off-diagonal elements equal to zero. Every realization of $Y$ in (1) corresponds to an identical realization of $Y$ in (2), but not vice versa.

Let the target correspond to a linear function of $P$ given by $T = g'P$. When $g = e_j$, a vector whose elements are all equal to zero except the element in row $j$, which is equal to one, the target is $P_j$, and the corresponding BLUP is a linear function of $Y$ that is unbiased and has minimum expected mean squared error (MSE). We may show (see Appendix A for details) that the BLUP of $P_j$ is

$\hat{P}_j = \hat{\mu} + k_j (Y_j - \hat{\mu})$   (3)

where $\hat{\mu}$ is the weighted least squares (WLS) estimate of the mean, i.e., $\hat{\mu} = \sum_{j=1}^{n} w_j Y_j$, with $w_j = \frac{1/(\gamma + \sigma_j^2)}{\sum_{j^*=1}^{n} 1/(\gamma + \sigma_{j^*}^2)}$, and $k_j = \frac{\gamma}{\gamma + \sigma_j^2}$. The expected MSE of the predictor, given by

$\mathrm{var}_{\xi R}(\hat{P}_j - P_j) = \sigma_j^2 k_j \left(1 + \frac{1 - k_j}{n\bar{k}}\right)$, where $\bar{k} = \frac{1}{n}\sum_{j=1}^{n} k_j$,

is smaller than the expected MSE ($\sigma_j^2$) attained when we use the subject's response as an estimate of the subject's latent value (see Appendix B). Since the only realization of $T$ that is not artificial is $y_j$, $\hat{P}_j$ may be considered a better estimate of $y_j$ than the observed response, $Y_j$.

The mixed model given by (2) is defined for a set, and not necessarily for a potentially realized sample from a population. One way to introduce the idea of a population in the MM is to assume the latent values for the subjects in the set are the realized latent values from a sample from a population. When the population size $N$ is large, so that the finite population sampling fraction can be ignored, $\mathrm{var}_\xi(P) = \gamma I_n$, where $N$ replaces $n$ in the definition of $\gamma$. An alternative mixed model considers the data to be the realized response of a random sample of subjects from a (finite) population (Stanek and Singer, 2004).
We refer to this model as the finite population mixed model (FPMM), and note that the BLUP given by (3) is not the same as the FPMM-BLUP. These differences warrant a closer examination of the model and underlying assumptions, which we consider via a simple enumerative example.

Examples

We discuss a simple example to compare the MM and FPMM BLUPs. Since the FPMM is defined for a simple random sample from a population, we begin by defining such a population, even though the MM requires only subjects in a set. The data correspond to $n = 2$ subjects, Daisy, labeled $s = 1$, and Rose, labeled $s = 3$, who are members of a population of $N = 3$ subjects summarized in Table 1.

- Insert Table 1 here -

Note that the response error variance differs between subjects.

We begin with a discussion of the MM. To match the notation used for the MM, we first order the labels from smallest to largest, indexing the smallest label ($s = 1$) by $j = 1$ and the next smallest label ($s = 3$) by $j = 2$. Next, we assume that for subject $j$, the response error can take on two equally likely values, corresponding to $\sigma_j$ or $-\sigma_j$. Under the response error model (1), each response (corresponding to a pair of values for the sample set) is equally likely, with probability 1/4. With these assumptions, we display in Table 2 the potentially realizable responses corresponding to the four combinations of response error.

- Insert Table 2 here -

The potentially realizable responses in Table 2 are possible responses (which we index by $t$) for the MM when the realization of $P$ (the MM-latent values) is $y$. Since the latent values are assumed to be exchangeable and $n = 2$, there are two possible realizations of $P$. Responses for the other realization of the MM-latent values are listed in Table 3.

- Insert Table 3 here -

The responses for the MM listed in Tables 2 and 3 correspond to the equally likely realizations, $Y_t$, for $t = 1, \ldots, 8$, of $Y$, each occurring with probability 1/8. The corresponding realizations, $P_t$, $t = 1, \ldots, 8$, of $P$ are the realized MM-latent values. When $t = 1, \ldots, 4$ (as in Table 2), the realizations of $P$ and $Y$ correspond to $y$ and $Y$ in (1), respectively. For such data, Daisy's realized MM-latent value is her actual latent value. The realizations of $P$ and $Y$ are artificial when $t = 5, \ldots, 8$ (as in Table 3).
When $t = 5, \ldots, 8$, Daisy's realized MM-latent value is Rose's latent value. The BLUP of $P_j$ under the MM given by (3) for each realized MM-response is given in Table 4. The columns in Table 4 are organized in two panels, with the first panel corresponding to Daisy and the second panel to Rose. The differences, $\hat{P}_j - P_j$, and the corresponding squared differences are given in the last two columns of each panel. Notice that the average difference is zero, satisfying the unbiased constraint given by $E_{\xi R}(\hat{P}_j - P_j) = 0$. The average squared difference, or MSE, is 0.99 for Daisy and 3.77 for Rose. These values are smaller than those that would result from a best linear unbiased estimator (BLUE) using model (1), namely $\sigma_1^2$ for Daisy and $\sigma_3^2 = 4$ for Rose.

- Insert Table 4 here -

There are some problems with these results, which can be illustrated by focusing on the MM-responses for Daisy (first panel of Table 4). Notice from Table 1 that Daisy has a single latent value, while Rose's latent value is also listed as a latent value for Daisy in Table 4. This second MM-latent value for Daisy, corresponding to the MM-responses $t = 5, \ldots, 8$, exists only in the mixed model, not in reality. Such a latent value is artificial, and one could argue that it should not be given a positive probability in the analysis. This is not due to a different interpretation of the subject labeled $j = 1$, since this label corresponds only to Daisy in the model definition.

These results shed light on the interpretation of bias and on the definition of the MSE for the MM. In order to compute bias, we need to subtract the subject's actual latent value for all settings, as shown in Table 5. Using the subject's actual latent value, the BLUP given by (3) is biased for each subject, and its MSE is larger than the MSE of the BLUE based on model (1).

- Insert Table 5 here -

In the MM, positive probability is given to MM-responses that are not potentially realizable. By averaging over these artificial responses in addition to the potentially realizable responses, the connection between the MM and reality is broken. This creates contradictions in the interpretation of results. For example, the latent value for Daisy ($j = 1$) is the same for all potentially realizable responses, but the expected value of the corresponding MM-latent values is $E_\xi(P_1) = \mu$, the average of the latent values in the set.

To retain the interpretation of the latent value for the subject, the MM-latent value should be defined only over the potentially realizable MM-responses, i.e., those corresponding to $t = 1, \ldots, 4$.
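The enumeration behind Tables 2 to 5 can be sketched in a few lines. This is a minimal illustration, not the authors' computation: the latent values (10 and 2) and response error variances (1 and 4) are assumed here for illustration only, and the function name `mm_blup` is hypothetical. The sketch lists the eight equally likely MM-responses, applies predictor (3), and averages the errors once against the MM-latent value and once against the subject's actual latent value.

```python
from itertools import permutations, product

def mm_blup(responses, gamma, sigma2):
    """Predictor (3): shrink each Y_j toward the WLS mean by k_j."""
    inv = [1.0 / (gamma + s2) for s2 in sigma2]
    w = [v / sum(inv) for v in inv]               # WLS weights w_j
    k = [gamma / (gamma + s2) for s2 in sigma2]   # shrinkage constants k_j
    mu_hat = sum(w_j * y_j for w_j, y_j in zip(w, responses))
    return [mu_hat + k_j * (y_j - mu_hat) for k_j, y_j in zip(k, responses)]

# Assumed inputs (illustrative only, not taken from the paper's tables).
y = (10.0, 2.0)        # actual latent values for j = 1, 2
sigma2 = (1.0, 4.0)    # response error variances for j = 1, 2
n = len(y)
mu = sum(y) / n
gamma = sum((v - mu) ** 2 for v in y) / (n - 1)

diff_mm = [0.0] * n    # sum of (predictor - MM-latent value)
sq_mm = [0.0] * n      # squared error against the MM-latent value
sq_true = [0.0] * n    # squared error against the actual latent value
count = 0
for p in permutations(y):                       # realizations of P
    for signs in product((1, -1), repeat=n):    # two-point response errors
        resp = [p_j + s * s2 ** 0.5 for p_j, s, s2 in zip(p, signs, sigma2)]
        pred = mm_blup(resp, gamma, sigma2)
        for j in range(n):
            diff_mm[j] += pred[j] - p[j]
            sq_mm[j] += (pred[j] - p[j]) ** 2
            sq_true[j] += (pred[j] - y[j]) ** 2
        count += 1

bias_mm = [d / count for d in diff_mm]
mse_mm = [v / count for v in sq_mm]
mse_true = [v / count for v in sq_true]
```

Under these assumed inputs the predictor is unbiased against the MM-latent value when averaged over all eight responses, while its MSE measured against each subject's actual latent value exceeds the response error variance, which is the pattern the text describes for Tables 4 and 5.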
Thus, the target quantity must be defined conditionally on the potentially realizable responses, even though the MM is defined unconditionally. Defining the MM-latent value as the expected value of $P_j$ only over the potentially realizable responses results in latent values for Daisy and for Rose that correspond to their true latent values.

Restricting evaluation of (3) to potentially realizable responses provides some insight on bias and MSE. The conditional bias is given by

$E_R(\hat{P}_j - P_j \mid P = y) = -(1 - k_j)(y_j - \mu_w)$, where $\mu_w = \sum_{j=1}^{n} w_j y_j$

(see Appendix C). Therefore, the conditional bias is nonzero for both Daisy (for whom it is negative) and Rose, and the average conditional bias over the subjects is not equal to zero. Using a similar definition for the MSE (see Appendix C), i.e.,

$E_R\left((\hat{P}_j - P_j)^2 \mid P = y\right) = (1 - k_j)^2\left[\sum_{j^*=1}^{n} w_{j^*}^2 \sigma_{j^*}^2 + (y_j - \mu_w)^2\right] + k_j \sigma_j^2\left[k_j + 2(1 - k_j) w_j\right]$,   (4)

it follows that the MSE for Daisy and the MSE for Rose are both smaller than the MSE of the simple responses, $Y_j$, $j = 1, 2$.

Estimating the Mean Latent Value

Our development has focused on estimating the MM-latent value for a subject. We can use similar methods to obtain an estimate of $T = g'P$ where $g = \frac{1}{n}1_n$ in the MM, a target that corresponds to the average MM-latent value, $\bar{P}$. The corresponding BLUE is the weighted least squares (WLS) estimator given by $\hat{\mu}$. Notice that $\bar{P}$ is equal to $\bar{y} = \frac{1}{n}\sum_{j=1}^{n} y_j$, the mean of the latent values in the response error model (1). The BLUE of $\bar{y}$ in (1) is the mean response, $\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j$. Since $\bar{P} = \bar{y}$, it is tempting to compare the BLUE obtained under model (1) with the BLUP obtained under model (2), as illustrated in Table 6.

- Insert Table 6 here -

Under model (1), there are no responses comparable to the MM-responses for $t = 5, \ldots, 8$. This is a consequence of the inclusion of artificial responses in the MM. The target parameter, $\bar{P}$, is constant over all possible MM-responses. If we define an estimator similar to $\bar{Y}$ for the MM-responses as $\bar{Y} = \frac{1}{n}\sum_{j=1}^{n} Y_j$, then the MSE of $\bar{Y}$ and $\hat{\mu}$ can be evaluated over the same sample space. The MSE of $\hat{\mu}$, given by $E_{\xi R}(\hat{\mu} - \bar{P})^2 = \frac{1 - \bar{k}}{n\bar{k}}\gamma$, is less than the MSE of $\bar{Y}$, given by $E_{\xi R}(\bar{Y} - \bar{P})^2 = \frac{1}{n^2}\sum_{j=1}^{n}\sigma_j^2$, which equals the MSE of $\bar{Y}$ under model (1), i.e., $E_R(\bar{Y} - \bar{y})^2 = \frac{1}{n^2}\sum_{j=1}^{n}\sigma_j^2$, providing the usual justification for the use of $\hat{\mu}$ instead of $\bar{Y}$.

The WLS estimator is unbiased when evaluated over all responses. Evaluated over potentially realizable responses, i.e., those corresponding to $t = 1, \ldots, 4$, the bias is $E_R(\hat{\mu} - T \mid P = y) = \sum_{j=1}^{n}\frac{k_j - \bar{k}}{n\bar{k}}\, y_j$. The unbiased property of the WLS estimate of the latent value mean holds only when expectation is taken over all possible responses, including those artificial responses that are not potentially realizable. The MSE, evaluated only over potentially realizable responses, is

$\mathrm{MSE}_R(\hat{\mu} - T \mid P = y) = \sum_{j=1}^{n} w_j^2 \sigma_j^2 + \left(\sum_{j=1}^{n}\left(w_j - \frac{1}{n}\right) y_j\right)^2$.

When $n = 2$, as in the example illustrated in Table 6, this expression simplifies to $E_{\xi R}(\hat{\mu} - \bar{P})^2 = \frac{1 - \bar{k}}{n\bar{k}}\gamma$. When $n > 2$, as illustrated next, the conditional MSE of the MM-BLUP is not equal to its unconditional MSE, and may be larger (or smaller) than the unconditional MSE.

A Slightly Larger Example

Although in the first example, with $n = 2$, it was possible to enumerate all outcomes, some issues that occur more generally could not be revealed. We briefly discuss a second example where the data correspond to $n = 3$ subjects, Daisy ($s = 1$), Lily ($s = 2$) and Rose ($s = 3$), to raise such issues. We order the labels in the set from smallest to largest, indexing the smallest label by $j = 1$, the next smallest label by $j = 2$, and the largest label by $j = 3$, and assume that the response error for subject $j$ can take on two equally likely values corresponding to $\sigma_j$ or $-\sigma_j$. With these assumptions, there are eight equally likely potentially realizable responses corresponding to the different combinations of response error (Table 7).

- Insert Table 7 here -

The $t = 1, \ldots, 8$ potentially realized responses in Table 7 are possible responses (which we index by $t$) for the MM when the realization of $P$ (the MM-latent values) is $y$. Since the latent values are assumed to be exchangeable and $n = 3$, there are six possible realizations of $P$. Replacing $y$ by each of the other realizations gives rise to 40 artificial responses that are not realizable, but are included with positive probability in the MM. The predictor of $P_j$ given by (3) under the MM for Daisy is listed for $t = 1, \ldots, 8$ in Table 8, and for $t = 9, \ldots, 48$ in Table 9. We summarize the results for the MM-BLUP of each subject in Table 10.

- Insert Tables 8-10 here -

Notice that when averaging over the potentially realizable responses ($t = 1, \ldots, 8$), the MM-latent value is the subject's latent value. The average squared difference between the MM-BLUP and the MM-latent value for the potentially realizable responses is larger than a similar average over the non-realizable responses for Daisy and Rose, but not for Lily.
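The conditional expressions above lend themselves to a small numerical check. The sketch below is an illustration under assumed inputs, not the paper's own calculation: latent values (10, 3, 2) and response error variances (1, 100, 4) for Daisy, Lily and Rose are assumptions chosen for illustration, and all function names are hypothetical. It evaluates the conditional bias, the conditional MSE in (4), and the conditional and unconditional MSE of the WLS mean.

```python
def mm_quantities(y, sigma2):
    """Return mu, gamma, k_j, k_bar, w_j and mu_w for a set of subjects."""
    n = len(y)
    mu = sum(y) / n
    gamma = sum((v - mu) ** 2 for v in y) / (n - 1)
    k = [gamma / (gamma + s2) for s2 in sigma2]
    k_bar = sum(k) / n
    w = [k_j / (n * k_bar) for k_j in k]          # equals the WLS weights
    mu_w = sum(w_j * y_j for w_j, y_j in zip(w, y))
    return mu, gamma, k, k_bar, w, mu_w

def conditional_bias_and_mse(y, sigma2):
    """Bias and MSE of predictor (3) for each subject, given P = y."""
    _, _, k, _, w, mu_w = mm_quantities(y, sigma2)
    noise = sum(w_j ** 2 * s2 for w_j, s2 in zip(w, sigma2))
    results = []
    for j in range(len(y)):
        bias = -(1 - k[j]) * (y[j] - mu_w)
        mse = ((1 - k[j]) ** 2 * (noise + (y[j] - mu_w) ** 2)
               + k[j] * sigma2[j] * (k[j] + 2 * (1 - k[j]) * w[j]))
        results.append((bias, mse))
    return results

def wls_mean_mse(y, sigma2):
    """Conditional (given P = y) and unconditional MSE of the WLS mean."""
    mu, gamma, _, k_bar, w, mu_w = mm_quantities(y, sigma2)
    n = len(y)
    conditional = (sum(w_j ** 2 * s2 for w_j, s2 in zip(w, sigma2))
                   + (mu_w - mu) ** 2)
    unconditional = gamma * (1 - k_bar) / (n * k_bar)
    return conditional, unconditional

# Assumed inputs (illustrative only).
y3, s3 = (10.0, 3.0, 2.0), (1.0, 100.0, 4.0)
mu_w3 = mm_quantities(y3, s3)[5]
cond3, uncond3 = wls_mean_mse(y3, s3)
y2, s2_ = (10.0, 2.0), (1.0, 4.0)
cond2, uncond2 = wls_mean_mse(y2, s2_)
daisy_bias, daisy_mse = conditional_bias_and_mse(y2, s2_)[0]
```

With two subjects the conditional and unconditional MSE of the WLS mean coincide, while with three subjects they differ, consistent with the statement that the equality is special to $n = 2$.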
It is the overall average MSE (over $t = 1, \ldots, 48$) that is usually evaluated for the MM, even though such an average includes responses that are not potentially realizable.

It is of value to consider the MM-BLUP of $\bar{P} = 5$ in this example. Over the potentially realizable responses ($t = 1, \ldots, 8$), the average of $\hat{\mu}$ is 6.009, while over the counterfactual responses the average falls below 5. Although the simple average of $\hat{\mu}$ over all MM-responses is equal to $\bar{P}$, this unbiased result occurs only if the artificial non-realizable responses are included. The average MSE for the potentially realizable responses ($t = 1, \ldots, 8$) differs from the average MSE for the counterfactual responses ($t = 9, \ldots, 48$). The simple average MSE (over all $t = 1, \ldots, 48$), given by 3.48, is larger than the average MSE for the potentially realizable responses, but smaller than the comparable average MSE under the response error model.

The Finite Population Mixed Model

We define a finite population mixed model by considering the data to be the realized response of a simple random sample of subjects from a finite population, assuming a single response for each subject. We define subjects, latent values, and response in the population using notation similar to that of model (1), by defining the population as a set of $N$ subjects. We use the subscript $s$ to label subjects in the population, and note that $y$ and $E$ represent $N \times 1$ vectors of latent values and response error, respectively. With this definition, $\mu = \frac{1}{N}\sum_{s=1}^{N} y_s$ corresponds to the usual finite population mean, while $\frac{N-1}{N}\gamma$ corresponds to the usual finite population variance, where $\gamma = \frac{1}{N-1}\sum_{s=1}^{N}(y_s - \mu)^2$. We define $Y$ as an $N \times 1$ response vector with elements $Y_s = y_s + E_s$, $s = 1, \ldots, N$, so that the response error model for the population is given by

$Y = y + E$.   (5)

We define a sample as a sequence of $n$ subjects, and use $i = 1, \ldots, n$ to index the subjects in the sequence. We index the possible sequences of subjects by $h$, where $h = 1, \ldots, H$ and $H = \frac{N!}{(N-n)!}$. Let $y_{hi}$ denote the latent value for the subject in position $i$ in sequence $h$ and define the sample vector of latent values by $y_h = (y_{h1}\ y_{h2}\ \cdots\ y_{hn})'$. This general representation of a sample was used by Godambe (1955). We define the response for sequence $h$ by $Y_h = u_h' Y$, so that the element $Y_{hi}$ denotes the response for the subject in position $i$ in sequence $h$, $Y_h = (Y_{h1}\ Y_{h2}\ \cdots\ Y_{hn})'$, and $u_h = (u_{h1}\ u_{h2}\ \cdots\ u_{hn})$ is an $N \times n$ matrix of constants with columns given by $u_{hi} = (u_{hi1}\ u_{hi2}\ \cdots\ u_{hiN})'$ for $i = 1, \ldots, n$.
The element $u_{his}$ has a value of one if subject $s$ is in position $i$ in sequence $h$, and zero otherwise. For example, when $n = 2$ and $N = 3$, the data for a sequence $h$ consisting of subject $s = 3$ followed by another subject $s^*$ are $\left((s = 3, Y_3), (s = s^*, Y_{s^*})\right)$, with responses $Y_h = u_h' Y$, where the $3 \times 2$ matrix $u_h$ has a one in row 3 of its first column, a one in row $s^*$ of its second column, and zeros elsewhere. Latent values and response errors for the subjects in sequence $h$ are defined in a similar manner by $y_h = u_h' y$ and $E_h = u_h' E$, respectively. While it is possible to relate the response for a subject in position $i$ in sequence $h$ to the response for subject $j$ defined by the response error model (1) (see Appendix D), it is important to note that the subject in position $i$ in sequence $h$ is not necessarily the same subject as the subject labeled $j$ in model (1), since for only one sequence will the order of the subjects in (1) match the subject's position in the sequence.

The finite population mixed model is defined by assuming that a sample corresponds to a randomly selected sequence, where $I_h$ represents an indicator random variable that has a value of one when sample sequence $h$ is selected, and zero otherwise, and subsequently summing over the possible sequences. We assume that all sample sequences are equally likely (corresponding to simple random sampling without replacement), so that $E_p(I_h) = \frac{1}{H}$ (where the subscript $p$ indicates expectation with respect to sampling). Next, let $Y_I = \sum_{h=1}^{H} I_h Y_h$ be a vector representing the sample response. Defining $U_I = \sum_{h=1}^{H} I_h u_h$, with elements $U_{is} = \sum_{h=1}^{H} I_h u_{his}$, $i = 1, \ldots, n$, $s = 1, \ldots, N$, $Y_I = U_I' Y$ is a vector of sample random variables, with $Y_{Ii} = \sum_{s=1}^{N} U_{is} Y_s$, $i = 1, \ldots, n$. Using (5) and defining subject effects by $\beta_s = y_s - \mu$, $s = 1, \ldots, N$, the finite population mixed model may be written as

$Y_{Ii} = \mu + b_i + E_{Ii}$   (6)

where $b_i = \sum_{s=1}^{N} U_{is}\beta_s$ and $E_{Ii} = \sum_{s=1}^{N} U_{is} E_s$, or in matrix form as

$Y_I = X\mu + Zb + E_I$

where $b = U_I'\beta$, $b = (b_1\ b_2\ \cdots\ b_n)'$, $\beta = (\beta_1\ \beta_2\ \cdots\ \beta_N)'$ and $E_I = U_I' E$. This represents the sample random variables in the finite population as defined by Stanek and Singer (2004). The random variable $b_i$ corresponds to the deviation of the subject's latent value from the population mean for the subject in position $i$ in a randomly selected sequence.

Let the target correspond to a linear function of $P_I = X\mu + Zb$ given by $T_I = g' P_I$.
When $g = e_i$, a vector whose elements are all equal to zero except the element in row $i$, which is equal to one, the target is $P_{Ii}$, and the FPMM-BLUP is a linear function of $Y_I$ that is unbiased and has minimum expected mean squared error (MSE). We show (see Appendix E for details) that the FPMM-BLUP is

$\hat{P}_{Ii} = \bar{Y}_I + k(Y_{Ii} - \bar{Y}_I)$

where $\bar{Y}_I = \frac{1}{n}\sum_{i=1}^{n} Y_{Ii}$ is the sample average response, $k = \frac{\gamma}{\gamma + \bar{\sigma}^2}$, and $\bar{\sigma}^2 = \frac{1}{N}\sum_{s=1}^{N}\sigma_s^2$. The expected MSE of the predictor is $\mathrm{var}_{pR}(\hat{P}_{Ii} - P_{Ii}) = \frac{\bar{\sigma}^2}{n}\left(1 + k(n-1)\right)$. When $T_I = \mu$, $\hat{P}_I = \bar{Y}_I$, and $\mathrm{var}_{pR}(\hat{P}_I - \mu) = (1 - f)\frac{\gamma}{n} + \frac{\bar{\sigma}^2}{n}$, where $f = n/N$ is the sampling fraction.

Of particular interest is a comparison of the average MSE for potentially realizable responses. For all sample sets and subjects, the MM-BLUP MSE is smaller than the MSE of the observed response, and smaller than the MSE of the FPMM-BLUP. For a given sample sequence, the FPMM-BLUP is biased, with the bias given by $E_R(\hat{P}_{Ii} - P_{Ii} \mid I_h = 1) = -(1 - k)(y_{hi} - \mu_h)$ (see Appendix F). Conditional on a sample sequence, the MSE of the FPMM-BLUP is given by

$\mathrm{MSE}_{pR}(\hat{P}_{Ii} - P_{Ii} \mid I_h = 1) = \left[\frac{2(1-k)}{n} + k\right] k\,\sigma_{hi}^2 + (1-k)^2\frac{\bar{\sigma}_h^2}{n} + (1-k)^2 (y_{hi} - \mu_h)^2$,   (7)

where $\mu_h$ and $\bar{\sigma}_h^2$ denote the averages of the latent values and of the response error variances, respectively, over the subjects in sequence $h$.

Examples

We consider the FPMM-BLUP for simple random samples of size $n = 2$ from the population of $N = 3$ subjects listed in Table 1. First, note that there are six possible sample sequences, with $2! = 2$ sequences for each sample set. Since the FPMM-BLUP is identical for a subject in different sequences in the same set, we list the $t = 1, \ldots, 12$ possible equally likely sample responses in Table 11, corresponding to the three sample sets.

- Insert Table 11 here -

Notice that the FPMM-BLUP is a biased predictor of the subject's latent value for each subject, but the average bias (over all subjects) is zero. The MSE differs between subjects, and exceeds the MSE of the observed response for Daisy and Rose, but is smaller (48.58 vs. 100) than the MSE of the observed response for Lily. The average MSE of the FPMM-BLUP over all subjects (i.e., 23.66) is smaller than the average MSE of the observed response (i.e., 35).

Table 12 provides a summary of the MM-BLUP and the FPMM-BLUP for the three different sets of $n = 2$ from the population of $N = 3$ listed in Table 1. Recall that the MM-BLUP is defined for each particular set, while the FPMM-BLUP is defined over all sets. The results in Table 12 are arranged in panels of rows corresponding to average predictors of Daisy's, Lily's, and Rose's latent values. Columns correspond to the average predictor, the bias, and the MSE.
The bias and MSE are evaluated for the MM relative to the MM-latent value, and relative to the subject's true latent value. The potentially realizable responses correspond to rows where $t = 1, \ldots, 4$. The last three rows in Table 12 summarize the average results over potentially realizable responses, over counterfactual responses that are not potentially realizable, and over all responses. Notice that the average bias over all responses is zero for each predictor, but when bias is calculated only over potentially realizable responses, the MM-BLUP is biased, while the FPMM-BLUP is not. The results in Table 12 illustrate the overlapping but distinct sample spaces that underlie the MM and the FPMM predictors.
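The FPMM quantities used in this example can be sketched as follows. This is a hedged illustration rather than the authors' code: the population values (latent values 10, 3, 2 and response error variances 1, 100, 4 for Daisy, Lily and Rose) are assumed for illustration, subjects are indexed from 0, and the function names are hypothetical. The sketch enumerates the ordered sample sequences, evaluates the conditional MSE in (7), and averages it per subject.

```python
from itertools import permutations

def fpmm_constants(y, sigma2):
    """Population mean, gamma, average response error variance, and k."""
    N = len(y)
    mu = sum(y) / N
    gamma = sum((v - mu) ** 2 for v in y) / (N - 1)
    sigma2_bar = sum(sigma2) / N
    return mu, gamma, sigma2_bar, gamma / (gamma + sigma2_bar)

def fpmm_blup(sample_responses, k):
    """FPMM-BLUP: shrink each sampled response toward the sample mean."""
    y_bar = sum(sample_responses) / len(sample_responses)
    return [y_bar + k * (v - y_bar) for v in sample_responses]

def expected_mse(y, sigma2, n):
    """Expected MSE of the FPMM-BLUP over sampling and response error."""
    _, _, sigma2_bar, k = fpmm_constants(y, sigma2)
    return sigma2_bar * (1 + (n - 1) * k) / n

def conditional_mse(y, sigma2, sequence, i):
    """Expression (7) for the subject in position i of a given sequence."""
    k = fpmm_constants(y, sigma2)[3]
    n = len(sequence)
    mu_h = sum(y[s] for s in sequence) / n          # sequence mean latent value
    s2_h = sum(sigma2[s] for s in sequence) / n     # sequence mean error variance
    s = sequence[i]
    return ((2 * (1 - k) / n + k) * k * sigma2[s]
            + (1 - k) ** 2 * s2_h / n
            + (1 - k) ** 2 * (y[s] - mu_h) ** 2)

def subject_mse(y, sigma2, n, s):
    """Average of (7) over all sample sequences that contain subject s."""
    values = [conditional_mse(y, sigma2, seq, seq.index(s))
              for seq in permutations(range(len(y)), n) if s in seq]
    return sum(values) / len(values)

# Assumed population (illustrative only): Daisy, Lily, Rose.
y_pop = (10.0, 3.0, 2.0)
s2_pop = (1.0, 100.0, 4.0)
sequences = list(permutations(range(3), 2))         # H = 3!/(3-2)! = 6
per_subject = [subject_mse(y_pop, s2_pop, 2, s) for s in range(3)]
overall = expected_mse(y_pop, s2_pop, 2)
```

Averaging the per-subject values reproduces the expected MSE, which gives a quick consistency check between (7) and the unconditional expression.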

An Example with N = 4 and n = 3

We consider a slightly larger example to compare the MM-BLUP and FPMM-BLUP. The example is for a population of $N = 4$ where a simple random sample of $n = 3$ subjects is selected, resulting in four possible sample sets. The population consists of the original population given in Table 1, and an additional subject, Violet, with latent value $y_4$ and response error variance $\sigma_4^2 = 5$. The comparison is made for each sample set, assuming that the set constitutes a population for the FPMM. This means that $\gamma$ and $\bar{\sigma}^2$ are different for different sets. We compare the MSE of the estimates of subjects' latent values from the MM-BLUP (4) and the FPMM-BLUP in Table 13. The results indicate that the MSE of the MM-BLUP is smaller than that of the FPMM-BLUP in most, but not all, settings. The estimate of Violet's latent value based on the FPMM-BLUP has smaller MSE than the MM-BLUP in sets that include Daisy.

Discussion

The comparison of the model-based formulation of the mixed model (2) and the finite population mixed model (6) via the examples provides some insight, and also reveals the opportunity for confusion in discussions of mixed models. First, the comparison provides some clarity to Robinson's (1991) discussion of whether the MM-BLUP should be termed an estimator or a predictor, and underscores the difficulty that Henderson (1975) had in providing a convincing interpretation of the MM-BLUP. Henderson (1984, page 37) posed the problem as follows: "Which is the more logical concept, prediction of a random variable or estimation of the realized value of a random variable? If we have an animal already born, it seems reasonable to describe the evaluation of its breeding value as an estimation problem. On the other hand, if we are interested in evaluating the potential breeding value of a mating between two potential parents, this would be a problem in prediction." The terminology of estimation applies to the MM-BLUP when the animal is already born, while prediction applies to the FPMM-BLUP when the mating parents have yet to be selected.
The interpretation of unbiasedness is also clarified. In the mixed model, we can distinguish $E_{\xi R}(Y_j) = \mu$ from $E_{R \mid \xi}(Y_j) = P_j$ (the MM-latent value for subject $j$), and from $E_{R \mid \xi}(Y_j \mid P = y) = y_j$, the true latent value for subject $j$. If our interest is in the latent value for subject $j$, the unbiased property of the MM-BLUP is defined as $E_{\xi R}(\hat{P}_j) = \mu$. This differs from the usual definition of unbiased, given by $E_{R \mid \xi}(\hat{P}_j \mid P = y) = y_j$. Neither the MM-BLUP nor the FPMM-BLUP is unbiased when this definition is adopted. The MM-BLUP is a biased estimator of the subject's latent value, while the FPMM-BLUP is a biased predictor of the realized random effect. Including "U" in the BLUP terminology may provide reassurance that BLUPs are acceptable for those who consider lack of bias a prerequisite for analysis. But truth would be better served if both MM-BLUPs and FPMM-BLUPs were described as biased but more accurate ways of estimating a subject's latent value.

An important aspect of the parallel development of the MM and FPMM is the illustration of the overlapping but distinct sample spaces. Since the examples we considered are small and the outcomes are discrete, it is possible to make the sample spaces explicit. More generally, the sample space is the product of possible realizations of $P$ and $E$. If response error has $m$ values for each subject, both the sample space for the MM and that for the FPMM when $n = N$ have $m^n\, n!$ possible values. These sample spaces overlap for the $m^n$ values corresponding to $P = y$. The additional $(n! - 1)\, m^n$ values in the MM when $P \neq y$ are artificial, while the $(n! - 1)\, m^n$ responses in the FPMM correspond to different permutations of the subjects that are all potentially realizable. The difference between the MM-BLUP and the FPMM-BLUP is due to their development over the different sample spaces.

We advocate evaluating statistics over sample spaces that are potentially realizable. This guideline requires statistics to be linked to reality, implying that only a portion of the sample space be used to evaluate estimators in the MM. It is consistent with Tukey's comment in the discussion of Nelder (1977) that our focus must be on questions, not models. By limiting evaluation of the estimators from the two formulations of the mixed model to the potentially realizable sample space, we keep the focus on a real question. With this focus, as illustrated via the examples, the MM-BLUP of a subject's latent value is not uniformly more accurate than the simple sample mean, or the FPMM-BLUP. More study in this area is clearly needed. Guidelines are lacking for estimator choice; understanding is lacking on how to artificially expand a sample space to produce more accurate estimators; practical issues where variance parameters are unknown are yet to be explored; and extensions to settings with auxiliary variables are not considered.
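The sample-space counts given in this discussion can be checked by brute force for small cases. The sketch below is illustrative only: it assumes distinct latent values, takes the response error of each subject to have $m$ equally likely values, and uses hypothetical function names.

```python
from itertools import permutations, product
from math import factorial

def sample_space_sizes(n, m):
    """Counts from the formulas: total, realizable (P = y), and the rest."""
    total = m ** n * factorial(n)
    realizable = m ** n
    return total, realizable, total - realizable

def enumerate_mm_points(latent, error_values):
    """Brute-force list of (P, error pattern) pairs in the MM sample space.

    Assumes the latent values are distinct, so every permutation of them
    is a different realization of P.
    """
    n = len(latent)
    return [(p, e)
            for p in permutations(latent)
            for e in product(error_values, repeat=n)]

# Two-point response errors (m = 2), with assumed distinct latent values.
points = enumerate_mm_points((10.0, 3.0, 2.0), (-1, 1))
realizable_points = [pt for pt in points if pt[0] == (10.0, 3.0, 2.0)]
```

For $n = 3$ and $m = 2$ this gives 48 points, 8 of them with $P = y$ and 40 others, matching the counts quoted for the three-subject example.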
The distictio betwee potetially realizable poits i the sample space ad artificial sample poits i the MM provides a cotext for uderstadig the cocer expressed i much of the classical statistical literature that oly variace compoets should be estimated ad radom effects should ot be predicted. First, otice that eve though the MM icludes artificial sample poits, there is a uderlyig physical reality to γ (whe defied for the set or for a populatio). This provides legitimacy to estimatig variace compoets. The ratioale for cocer over predictig radom effects i a MM is also evidet. For a subect, there is a differece betwee the MM-latet values, ad the subect s latet value. I the MM, the latet value associated with a subect is ot costat, but chages for differet sample poits. There is o reaso to be iterested i the latet value for the subect that is assiged to the artificial sample poits. This reasoig provides the logic behid a statemet that predictio of radom effects has o meaig. Our uderstadig of this cocer chages if we cosider estimatio of realized radom effects, where the term realized implies limitig cosideratio to sample poits that are potetially realizable. By restrictig the sample space to such poits, the MM latet value is costat for a subect, ad equal to the subect s latet value. Estimatio of the realized C09ed3v.doc 0/7/009 5:40 PM 3

random effect in the MM is meaningful, as is prediction of the realized random effect in the FPMM.

There is a simple connection between the MM and Bayesian methods. The distribution of $P$ in the MM has been termed the objective prior distribution, as in Robinson (1991). It has a simple interpretation as the distribution of subjects' latent values, and characterizes natural variation between subjects. If $n < N$, expanding the distribution of $P$ to be a subset of $n$ random variables from an exchangeable distribution of latent values in the population, which we denote by $P_N$, will expand the number of artificial responses in the MM sample space, but not alter the number of potentially realizable responses. Although each realization of $P_N$ is a set of latent values from the population, this expansion does not make the estimator based on a set of subjects from the MM more general, nor does it guarantee that the resulting estimator will be more accurate. The accuracy of estimators that are developed from such models should be evaluated only over potentially realizable points in the sample space. Such an evaluation may provide insight as to whether artificial expansion of sample spaces can give rise to more accurate estimators.

It is possible to expand the discussion of Bayesian concepts to include a distribution of fixed effects, which Robinson (1991) refers to as a subjective prior. A sample point in the resulting joint distribution must have parameters equal to those in the actual set of subjects in order for potentially realizable responses to be included in the sample space. The extension to subjective priors extends only the number of artificial points in the sample space, and does not alter the set of potentially realizable responses from which the resulting estimator should be evaluated. Still, it is possible that such an extension of the artificial sample points will produce a more accurate estimator in some settings. This is another area deserving further study.
There is a firm connection between the MM and the FPMM in the survey sampling literature dating back to Godambe's (1955) and Godambe and Joshi's (1965) important papers. This work stimulated a crisis in the foundations of statistical inference, as summarized by Cassel et al. (1977). We discuss this connection, since it provides a unifying framework for ideas of statistical inference. Godambe (1955) concluded that there is no best linear unbiased estimator of a finite population total based on probability samples. This result was startling since the sample mean from a simple random sample is commonly presented as the BLUE of the population mean. Important ideas in Godambe's development include the very general definition of a linear estimator, and the need of additional assumptions beyond sampling to obtain an optimal estimator. The linear estimator proposed by Godambe (1955) includes separate coefficients for each subject in each position in a sample, where sample points correspond to realizations of the subset of the first $n$ random variables representing a permutation of subject values in a finite population. Subsequently, Godambe and Joshi (1965) concluded that it was sufficient for coefficients to be defined for each subject in a sample set, not a sample sequence. Optimal coefficients can incorporate subject specific information, such as different response error variances, since subjects are identifiable in a set. Additionally, since the sample set is the starting point, the connection back to the possible sampling probabilities is not relevant, since inference is conditional on the sample set.

The setting considered by Godambe (1955) did not include response error. Adding response error to a subject's latent value in Godambe's basic model does not alter the conclusion of nonexistence of a BLUE, even though it is possible to specify a set of estimating equations. While the equations can be solved, the solution does not result in an estimator since it includes nonsample latent values. Godambe (1955) introduced additional a priori model assumptions (motivated by including an auxiliary variable) in order to develop an estimator of the population total. These assumptions are similar to the MM assumptions on latent values. As a result, the MM can be considered to be a variation on the suggestion by Godambe (1955).

These basic ideas are the foundation for superpopulation models in survey sampling. We identify aspects of these models that are related to the MM. First, there is a connection between the realized sample and the superpopulation, which we define in terms of a set of latent values given by realizations of $P = \mu + a$. These latent values need not be simply the latent values for the subjects in the sample set, but could be defined quite generally. When the latent values for the subjects in the sample set are included in this definition, it is always possible to consider the realized sample as a possible sample point in the superpopulation model. Notice how this definition obscures the interpretation of a superpopulation, since the only identifiable subjects are those in the sample. While it may be appealing to think of a superpopulation as a larger finite population (as in Voss (1999)), there is no need to do so.

The FPMM is the result of moving in a different direction as a consequence of Godambe's nonexistence results. Rather than including additional assumptions to the model for a sample set, the FPMM collapses random variables to a lower dimensional space. One casualty in the collapsing is a loss in identifiability of subjects for the FPMM random variables.
This identifiability is lost when developing predictors of realized random effects, but regained once the subjects in the sample set are realized. In order to maintain these distinctions, Stanek and Singer (2004) have described the FPMM-BLUP as a predictor of the latent value of a realized subject in a position in a sample. The terminology by itself is confusing, since the position is not of substantive interest in a practical problem, and the subject (whose latent value is of interest) is not identifiable. An advantage of the approach is the inclusion only of sample points that are potentially realizable, a fact that simplifies assessing performance of the predictor. While the simple examples illustrate that the FPMM predictor may outperform the MM predictor in some settings, guidance for its use is currently lacking.

Some of the main ideas in these results are far reaching. First, we conclude that an important area for investigation of statistical inference is conditional on the sample set. Second, we conclude that it is crucial to evaluate properties of estimators over potentially realizable sample points, and not include points in the portion of an artificial sample space. This simple guideline can eliminate debate over inclusion of prior distributions, or other artificial assumptions, since use in developing estimators is allowed, but the evaluation of the properties of the estimators is tied to reality. Finally, these results illustrate that there is a lot to be learned. Many accepted procedures, models, and theories appear to be based on ideas that are not consistent with these two conclusions. Their re-examination may lead to a sounder basis for statistics in the future.
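The second conclusion can be made concrete with a small numerical sketch that evaluates a linear predictor $c'Y$ only over potentially realizable points ($P = y$). This is an illustration under stated assumptions, not the authors' computation: the latent values, response error variances, and function names are hypothetical; each subject's response error is assumed to take the two equally likely values $\pm\sigma_j$; and the MM-BLUP coefficients use the shrinkage constants $k_j = \gamma^2/(\gamma^2 + \sigma_j^2)$ and weights $w_j = k_j/\sum_i k_i$ of Appendix A, with $\gamma^2$ taken (as an assumption) to be the variance of the latent values in the set with divisor $n - 1$.

```python
from itertools import product

def mm_blup_coefficients(gamma2, sigma2, j):
    """Coefficients c such that the MM-BLUP for subject j is sum(c[i] * Y[i])."""
    k = [gamma2 / (gamma2 + s2) for s2 in sigma2]   # shrinkage constants k_i
    w = [ki / sum(k) for ki in k]                   # weights of the WLS mean
    c = [(1 - k[j]) * wi for wi in w]               # (1 - k_j) times the WLS weights
    c[j] += k[j]                                    # plus k_j on subject j's own response
    return c

def realizable_mse(c, y, sigma2, j):
    """Average of (c'Y - y[j])**2 over realizable points, errors being +/- sigma_i."""
    sd = [s2 ** 0.5 for s2 in sigma2]
    total, count = 0.0, 0
    for signs in product((-1, 1), repeat=len(y)):
        Y = [yi + s * sdi for yi, s, sdi in zip(y, signs, sd)]
        pred = sum(ci * Yi for ci, Yi in zip(c, Y))
        total += (pred - y[j]) ** 2
        count += 1
    return total / count

y = [10.0, 12.0]       # hypothetical latent values
sigma2 = [1.0, 4.0]    # hypothetical response error variances
n = len(y)
ybar = sum(y) / n
gamma2 = sum((yi - ybar) ** 2 for yi in y) / (n - 1)   # assumption: divisor n - 1

for j in range(n):
    blup = realizable_mse(mm_blup_coefficients(gamma2, sigma2, j), y, sigma2, j)
    mean = realizable_mse([1.0 / n] * n, y, sigma2, j)
    print(j, round(blup, 4), round(mean, 4))
```

Because only realizable points enter the average, the result is the conditional MSE $\sum_i c_i^2\sigma_i^2 + (c'y - y_j)^2$; any other linear predictor can be compared on the same footing by passing its coefficients to `realizable_mse`.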

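Appendix A.

We develop the predictor of $P_j = X_j\mu + Z_j a$, where $X_j = 1$ and $Z_j = e_j'$, as a linear function of $Y$, i.e. $\hat{P}_j = c'Y$, that is unbiased and has minimum expected MSE in the mixed model, following the development by Goldberger (1962) as reviewed by Robinson (1991). We first note that

$$E_{\xi R}\begin{pmatrix} Y \\ P_j \end{pmatrix} = \begin{pmatrix} X \\ X_j \end{pmatrix}\mu \quad\text{and}\quad \operatorname{var}_{\xi R}\begin{pmatrix} Y \\ P_j \end{pmatrix} = \begin{pmatrix} \Omega & Z\Gamma Z_j' \\ Z_j\Gamma Z' & Z_j\Gamma Z_j' \end{pmatrix}.$$

The unbiased constraint requires that $E_{\xi R}(\hat{P}_j - P_j) = 0$. Since

$$E_{\xi R}(\hat{P}_j - P_j) = E_{\xi R}\left(c'Y - (X_j\mu + Z_j a)\right) = E_{\xi}\left(c'P - \begin{pmatrix} X_j & Z_j \end{pmatrix}\begin{pmatrix} \mu \\ a \end{pmatrix}\right) = c'X\mu - X_j\mu,$$

where $P = (P_1\ P_2\ \cdots\ P_n)'$, the unbiased constraint is given by $c'X - 1 = 0$. Minimizing

$$\operatorname{var}_{\xi R}(\hat{P}_j - P_j) = c'\Omega c - 2c'Z\Gamma Z_j' + Z_j\Gamma Z_j'$$

with respect to $c$ subject to the unbiased constraint results in

$$\hat{P}_j = X_j\hat{\mu} + Z_j\Gamma Z'\Omega^{-1}(Y - X\hat{\mu}), \quad\text{where}\quad \hat{\mu} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y.$$

Since $\Omega = \bigoplus_{j=1}^{n}(\gamma^2 + \sigma_j^2) - \frac{\gamma^2}{n}J_n$,

$$\Omega^{-1} = \frac{1}{\gamma^2}\left(\bigoplus_{j=1}^{n} k_j + \frac{1}{n(1 - \bar{k})}\,kk'\right),$$

where $k_j = \dfrac{\gamma^2}{\gamma^2 + \sigma_j^2}$, $k = (k_1\ k_2\ \cdots\ k_n)'$ and $\bar{k} = \frac{1}{n}\sum_{j=1}^{n}k_j$. Then $X'\Omega^{-1} = \dfrac{1}{\gamma^2(1 - \bar{k})}\,k'$, so that $X'\Omega^{-1}X = \dfrac{n\bar{k}}{\gamma^2(1 - \bar{k})}$. Using this result,

$$\hat{\mu} = \frac{1}{n\bar{k}}\sum_{j=1}^{n}k_jY_j.$$

Now $Z_j\Gamma Z'\Omega^{-1} = k_je_j' - \dfrac{1 - k_j}{n(1 - \bar{k})}\,k'$, and hence

$$Z_j\Gamma Z'\Omega^{-1}(Y - X\hat{\mu}) = k_j(Y_j - \hat{\mu}) - \frac{1 - k_j}{n(1 - \bar{k})}\,k'(Y - X\hat{\mu}),$$

where $k'(Y - X\hat{\mu}) = 0$, so that

$$\hat{P}_j = X_j\hat{\mu} + Z_j\Gamma Z'\Omega^{-1}(Y - X\hat{\mu}) = \hat{\mu} + k_j(Y_j - \hat{\mu}).$$

Appendix B. Mean Squared Error

The mean squared error of $\hat{P}_j$ under the model is $\operatorname{var}_{\xi R}(\hat{P}_j - P_j)$, where $\hat{P}_j = X_j\hat{\mu} + Z_j\Gamma Z'\Omega^{-1}(Y - X\hat{\mu})$ and $P_j$ is given by $P_j = X_j\mu + Z_ja$. Notice that we can write $\hat{P}_j = c'Y$ where

$$c = (1 - k_j)\,w + k_je_j, \qquad w = \frac{1}{n\bar{k}}\,k.$$

First, observe that

$$E_{R\mid\xi}(\hat{P}_j) = c'P = (1 - k_j)\bar{P}_w + k_jP_j = \bar{P}_w + k_j(P_j - \bar{P}_w), \quad\text{where}\quad \bar{P}_w = \sum_{i=1}^{n}w_iP_i.$$

Now $\operatorname{var}_{\xi R}(\hat{P}_j - P_j) = \operatorname{var}_{\xi R}\left(\begin{pmatrix} c' & -Z_j \end{pmatrix}\begin{pmatrix} Y \\ a \end{pmatrix}\right)$, and, using $Z = I_n$ and $\Omega = \Gamma + \bigoplus_{i=1}^{n}\sigma_i^2$,

$$\operatorname{var}_{\xi R}\begin{pmatrix} Y \\ a \end{pmatrix} = \begin{pmatrix} \Omega & \Gamma \\ \Gamma & \Gamma \end{pmatrix} = \begin{pmatrix} \Gamma + \bigoplus_{i=1}^{n}\sigma_i^2 & \Gamma \\ \Gamma & \Gamma \end{pmatrix}.$$

As a result,

$$\operatorname{var}_{\xi R}(\hat{P}_j - P_j) = c'\left(\bigoplus_{i=1}^{n}\sigma_i^2\right)c + c'\Gamma c - 2Z_j\Gamma c + Z_j\Gamma Z_j'.$$

Using $c = (1 - k_j)w + k_je_j$ and $Z_j = e_j'$, we expand these terms to obtain

$$c'\left(\bigoplus_{i=1}^{n}\sigma_i^2\right)c = (1 - k_j)^2\sum_{i=1}^{n}w_i^2\sigma_i^2 + 2k_j(1 - k_j)w_j\sigma_j^2 + k_j^2\sigma_j^2$$

and, since $\Gamma = \gamma^2\left(I_n - \frac{1}{n}J_n\right)$ and the elements of $c - e_j$ sum to zero,

$$c'\Gamma c - 2e_j'\Gamma c + e_j'\Gamma e_j = (c - e_j)'\Gamma(c - e_j) = \gamma^2(1 - k_j)^2\left(\sum_{i=1}^{n}w_i^2 - 2w_j + 1\right).$$

These terms simplify using $\sigma_i^2 = \gamma^2(1 - k_i)/k_i$, which gives $\sum_{i=1}^{n}w_i^2\sigma_i^2 = \gamma^2\left(\dfrac{1}{n\bar{k}} - \sum_{i=1}^{n}w_i^2\right)$ and $k_j\sigma_j^2 = \gamma^2(1 - k_j)$. Combining all terms and simplifying the resulting expression,

$$\operatorname{var}_{\xi R}(\hat{P}_j - P_j) = \sigma_j^2k_j + \sigma_j^2k_j\,\frac{1 - k_j}{n\bar{k}}.$$

Appendix C. The Conditional MSE

We evaluate $E_R(\hat{P}_j - P_j \mid P = y)$. Now $\hat{P}_j = c'Y$, $E_R(Y \mid P = y) = y$, and $P_j = y_j$ when $P = y$. As a result,

$$E_R(\hat{P}_j - P_j \mid P = y) = E_R(c'Y - y_j \mid P = y) = c'y - y_j.$$

Since $c = (1 - k_j)w + k_je_j$,

$$c'y = (1 - k_j)\sum_{i=1}^{n}w_iy_i + k_jy_j = \mu_w + k_j(y_j - \mu_w),$$

where $w_i = \dfrac{k_i}{n\bar{k}}$ and $\mu_w = \sum_{i=1}^{n}w_iy_i$. Hence,

$$E_R(\hat{P}_j - P_j \mid P = y) = -(1 - k_j)(y_j - \mu_w).$$

We evaluate $E_R\left((\hat{P}_j - P_j)^2 \mid P = y\right)$ by noting that, given $P = y$, $\hat{P}_j - P_j = c'(y + E) - y_j = c'E + (c'y - y_j)$. Now

$$E_R\left((\hat{P}_j - P_j)^2 \mid P = y\right) = E_R\left((c'E)^2\right) + 2\,c'E_R(E)\,(c'y - y_j) + (c'y - y_j)^2,$$

with $E_R(E) = 0$ and $E_R(EE') = \bigoplus_{i=1}^{n}\sigma_i^2$. As a result,

$$E_R\left((\hat{P}_j - P_j)^2 \mid P = y\right) = c'\left(\bigoplus_{i=1}^{n}\sigma_i^2\right)c + (c'y - y_j)^2.$$

Now

$$c'\left(\bigoplus_{i=1}^{n}\sigma_i^2\right)c = (1 - k_j)^2\sum_{i=1}^{n}w_i^2\sigma_i^2 + 2k_j(1 - k_j)w_j\sigma_j^2 + k_j^2\sigma_j^2.$$

Using $c'y = (1 - k_j)\mu_w + k_jy_j$, $(c'y - y_j)^2 = (1 - k_j)^2(y_j - \mu_w)^2$. As a result,

$$E_R\left((\hat{P}_j - P_j)^2 \mid P = y\right) = (1 - k_j)^2\sum_{i=1}^{n}w_i^2\sigma_i^2 + 2k_j(1 - k_j)w_j\sigma_j^2 + k_j^2\sigma_j^2 + (1 - k_j)^2(y_j - \mu_w)^2$$

$$= (1 - k_j)^2\left(\sum_{i=1}^{n}w_i^2\sigma_i^2 + (y_j - \mu_w)^2\right) + k_j\sigma_j^2\left(k_j + 2(1 - k_j)w_j\right).$$

Appendix D. Relationship Between Models for Sequences and Sets

Response for sequence $h$ given by $Y_h$ can be related to response defined by the response error model (). To see this, we represent the indicator variables that define sequence $h$ in $u_h$ as a permutation (defined by $v_m$) of elements in set $h$ (defined by $\delta_h$) such that $u_h' = \delta_h'v_m'$, where

$$\delta_h = \begin{pmatrix} \delta_{h11} & \delta_{h12} & \cdots & \delta_{h1N} \\ \delta_{h21} & \delta_{h22} & \cdots & \delta_{h2N} \\ \vdots & \vdots & & \vdots \\ \delta_{hn1} & \delta_{hn2} & \cdots & \delta_{hnN} \end{pmatrix},$$

$\delta_{hjs}$ has a value of one if the $j$th smallest subject's label in set $h$ is for subject $s$, and zero otherwise, and

$$v_m = \begin{pmatrix} v_{m11} & v_{m12} & \cdots & v_{m1n} \\ v_{m21} & v_{m22} & \cdots & v_{m2n} \\ \vdots & \vdots & & \vdots \\ v_{mn1} & v_{mn2} & \cdots & v_{mnn} \end{pmatrix}$$

with elements $v_{mij}$ having a value of one if the $j$th smallest label in set $h$ is in position $i$, and zero otherwise. For example, when $n = 2$ and $N = 3$, the data for sequence $h$ consisting of subject $s = 3$ followed by $s = 1$, $\left((s = 3, Y_{h1})\ (s = 1, Y_{h2})\right)$, is defined by using $v_m = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ to permute the subjects in set $h$ defined by $\delta_h = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}$.

Appendix E.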

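The FPMM-BLUP of a sample subject's latent value, $P_{Ii} = X_i\mu + Z_ib$ where $X_i = 1$ and $Z_i = e_i'$, may be obtained similarly to the development in Appendix A. We first note that $\Gamma = \gamma^2\left(I_n - \frac{1}{N}J_n\right)$ and $\Omega = \Gamma + \sigma^2I_n$, so that

$$E_{pR}\begin{pmatrix} Y_I \\ P_{Ii} \end{pmatrix} = \begin{pmatrix} X \\ X_i \end{pmatrix}\mu \quad\text{and}\quad \operatorname{var}_{pR}\begin{pmatrix} Y_I \\ P_{Ii} \end{pmatrix} = \begin{pmatrix} \Omega & Z\Gamma Z_i' \\ Z_i\Gamma Z' & Z_i\Gamma Z_i' \end{pmatrix}.$$

The predictor is a linear function of $Y_I$ given by $\hat{P}_{Ii} = c'Y_I$ such that $E_{pR}(\hat{P}_{Ii} - P_{Ii}) = 0$, which implies that $c'X - 1 = 0$. The FPMM-BLUP is given by

$$\hat{P}_{Ii} = X_i\hat{\mu} + Z_i\Gamma Z'\Omega^{-1}(Y_I - X\hat{\mu}), \quad\text{where}\quad \hat{\mu} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y_I.$$

Since

$$\Omega^{-1} = \frac{1}{\gamma^2 + \sigma^2}\left(I_n + \frac{k}{N - nk}J_n\right), \qquad k = \frac{\gamma^2}{\gamma^2 + \sigma^2},$$

we have $X'\Omega^{-1}X = \dfrac{n}{(1 - f)\gamma^2 + \sigma^2}$ where $f = \dfrac{n}{N}$, $X'\Omega^{-1}Y_I = \dfrac{1}{(\gamma^2 + \sigma^2)(1 - fk)}\,1_n'Y_I$, $\hat{\mu} = (X'\Omega^{-1}X)^{-1}X'\Omega^{-1}Y_I = \bar{Y}$, and

$$Z_i\Gamma Z'\Omega^{-1} = k\left(e_i' - \frac{1 - k}{N - nk}\,1_n'\right).$$

Since $Y_I - X\hat{\mu} = \left(I_n - \frac{1}{n}J_n\right)Y_I$, $Z_i\Gamma Z'\Omega^{-1}(Y_I - X\hat{\mu}) = k\,e_i'\left(I_n - \frac{1}{n}J_n\right)Y_I$, so that

$$\hat{P}_{Ii} = \bar{Y} + k(Y_{Ii} - \bar{Y}).$$

Using these expressions, $\hat{P}_{Ii} = \left(\frac{1}{n}1_n' + k\,e_i'\left(I_n - \frac{1}{n}J_n\right)\right)Y_I = c_i'Y_I$ where $c_i' = \frac{1}{n}1_n' + k\,e_i'\left(I_n - \frac{1}{n}J_n\right)$. Now

$$\operatorname{var}_{pR}\begin{pmatrix} Y_I \\ b \end{pmatrix} = \begin{pmatrix} \Gamma + \sigma^2I_n & \Gamma \\ \Gamma & \Gamma \end{pmatrix}.$$

As a result,

$$\operatorname{var}_{pR}(\hat{P}_{Ii} - P_{Ii}) = c_i'(\sigma^2I_n + \Gamma)c_i - 2Z_i\Gamma c_i + Z_i\Gamma Z_i'.$$

Now

$$c_i'(\sigma^2I_n + \Gamma)c_i = c_i'\left((\sigma^2 + \gamma^2)I_n - \frac{\gamma^2}{N}J_n\right)c_i = (\sigma^2 + \gamma^2)\,c_i'c_i - \frac{\gamma^2}{N}\,c_i'J_nc_i,$$

and $c_i'c_i = \dfrac{1}{n} + k^2\,\dfrac{n - 1}{n}$ while $c_i'J_nc_i = 1$, resulting in

$$c_i'(\sigma^2I_n + \Gamma)c_i = (\sigma^2 + \gamma^2)\left(\frac{1}{n} + k^2\,\frac{n - 1}{n}\right) - \frac{\gamma^2}{N}.$$

Also, $Z_i\Gamma c_i = \gamma^2\left(\dfrac{1}{n} + k\,\dfrac{n - 1}{n}\right) - \dfrac{\gamma^2}{N}$ and $Z_i\Gamma Z_i' = \gamma^2\left(1 - \dfrac{1}{N}\right)$. Combining these terms and simplifying,

$$\operatorname{var}_{pR}(\hat{P}_{Ii} - P_{Ii}) = \frac{\sigma^2}{n}\left(1 + k(n - 1)\right).$$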
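The closing expressions of this appendix translate directly into a few lines of code. The sketch below is illustrative only: the function names and the numbers in the usage lines are hypothetical, a common response error variance $\sigma^2$ is assumed as in this appendix, and the MSE is coded in the form $(\sigma^2/n)(1 + k(n - 1))$, which is our reading of the combined terms.

```python
def fpmm_blup(Y, gamma2, sigma2):
    """Shrink each sample response toward the sample mean by k = gamma2 / (gamma2 + sigma2)."""
    n = len(Y)
    k = gamma2 / (gamma2 + sigma2)
    ybar = sum(Y) / n
    return [ybar + k * (yi - ybar) for yi in Y]

def fpmm_blup_mse(n, gamma2, sigma2):
    """Expected MSE of the FPMM-BLUP, coded as (sigma2 / n) * (1 + k * (n - 1))."""
    k = gamma2 / (gamma2 + sigma2)
    return sigma2 / n * (1 + k * (n - 1))

# Hypothetical sample of n = 2 responses with gamma2 = sigma2 = 2, so that k = 0.5.
print(fpmm_blup([9.0, 13.0], 2.0, 2.0))   # [10.0, 12.0]
print(fpmm_blup_mse(2, 2.0, 2.0))         # 1.5
```

With $k$ near one (little response error) the predictor stays close to the subject's own response; with $k$ near zero it collapses to the sample mean, and the MSE tends to $\sigma^2/n$.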

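Appendix F. The MSE of the FPMM-BLUP for a Sample Set

The FPMM-BLUP of a sample subject's latent value, $P_{Ii} = X_i\mu + Z_ib$ where $X_i = 1$ and $Z_i = e_i'$, is $\hat{P}_{Ii} = \bar{Y} + k(Y_{Ii} - \bar{Y})$. The predictor is developed over all possible sample sequences identified by $u_h' = \delta_h'v_m'$, where a realized sample sequence is the realization of $I_u$. We evaluate $\operatorname{var}_{pR}(\hat{P}_{Ii} - P_{Ii} \mid I_h)$. Since $Y_I \mid I_h = v_m\delta_hY$ and $P_I \mid I_h = v_m\delta_hy$,

$$(\hat{P}_{Ii} - P_{Ii}) \mid I_h = c_m'\delta_hY - g_m'\delta_hy,$$

where we define $c_m' = c_i'v_m$ and $g_m' = g'v_m$. As a result,

$$\operatorname{var}_{pR}(\hat{P}_{Ii} - P_{Ii} \mid I_h) = c_m'\,\delta_h\operatorname{var}_R(Y)\,\delta_h'\,c_m.$$

Now $\delta_h\operatorname{var}_R(Y)\,\delta_h' = \bigoplus_{j=1}^{n}\sigma_{hj}^2$ and $c_m' = \frac{1}{n}1_n' + k\,e_i'v_m\left(I_n - \frac{1}{n}J_n\right)$. Notice that $e_i'v_m = e_j'$, where $j$ is the element with $v_{mij} = 1$, so that we can express $c_m' = \frac{1}{n}1_n' + k\,e_j'\left(I_n - \frac{1}{n}J_n\right)$. As a result, defining $\bar{\sigma}_h^2 = \frac{1}{n}\sum_{j=1}^{n}\sigma_{hj}^2$,

$$\operatorname{var}_{pR}(\hat{P}_{Ii} - P_{Ii} \mid I_h) = c_m'\left(\bigoplus_{j=1}^{n}\sigma_{hj}^2\right)c_m = \frac{\bar{\sigma}_h^2}{n} + \frac{2k}{n}\left(\sigma_{hj}^2 - \bar{\sigma}_h^2\right) + k^2\left(\sigma_{hj}^2 - \frac{2}{n}\sigma_{hj}^2 + \frac{\bar{\sigma}_h^2}{n}\right)$$

$$= \frac{1}{n}\left((1 - k)^2\bar{\sigma}_h^2 + 2k(1 - k)\sigma_{hj}^2\right) + k^2\sigma_{hj}^2.$$

Also, setting $\mu_h = \frac{1}{n}\sum_{j=1}^{n}y_{hj}$,

$$E_R(\hat{P}_{Ii} - P_{Ii} \mid I_h) = (c_m - g_m)'E_R(\delta_hY) = -(1 - k)(y_{hj} - \mu_h),$$

and the MSE is given by

$$\operatorname{MSE}_{pR}(\hat{P}_{Ii} - P_{Ii} \mid I_h) = \frac{1}{n}\left((1 - k)^2\bar{\sigma}_h^2 + 2k(1 - k)\sigma_{hj}^2\right) + k^2\sigma_{hj}^2 + (1 - k)^2(y_{hj} - \mu_h)^2.$$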
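As a small worked illustration of the conditional MSE for a sample set, the sketch below computes the variance and squared bias directly from the coefficients of $\bar{Y} + k(Y_j - \bar{Y})$, that is, weight $(1 - k)/n$ on every response plus $k$ on the subject's own response. The function name, the value of $k$, and the latent values and variances are hypothetical and serve only to show the arithmetic.

```python
def fpmm_conditional_mse(y_set, sigma2_set, k, j):
    """Conditional MSE of ybar + k * (Y_j - ybar) for the j-th subject of a sample set.

    y_set: latent values of the subjects in the set (hypothetical inputs).
    sigma2_set: their response error variances.
    k: shrinkage constant, supplied by the caller.
    """
    n = len(y_set)
    c = [(1 - k) / n] * n          # weight (1 - k) / n on every response
    c[j] += k                      # plus k on subject j's own response
    variance = sum(ci ** 2 * s2 for ci, s2 in zip(c, sigma2_set))
    mu_h = sum(y_set) / n
    bias = -(1 - k) * (y_set[j] - mu_h)
    return variance + bias ** 2

# Hypothetical set of two subjects with k = 0.5.
print(fpmm_conditional_mse([10.0, 12.0], [1.0, 4.0], 0.5, 0))   # 1.0625
```

For these inputs the variance part is $0.8125$ and the squared bias is $0.25$, which also equals $\frac{1}{n}\left((1 - k)^2\bar{\sigma}_h^2 + 2k(1 - k)\sigma_{hj}^2\right) + k^2\sigma_{hj}^2 + (1 - k)^2(y_{hj} - \mu_h)^2$ evaluated at the same values.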

References

Brown, E.M., and Kass, R.E. (2009). What is statistics? (with discussion), The American Statistician, 63.

Cassel, C.M., Särndal, C.E. and Wretman, J.H. (1977). Foundations of Inference in Survey Sampling, New York, NY: John Wiley.

Godambe, V.P. (1955). A unified theory of sampling from finite populations. Journal of the Royal Statistical Society B, 17.

Godambe, V.P. and Joshi, V.M. (1965). Admissibility and Bayes estimation in sampling from finite population. I. The Annals of Mathematical Statistics.

Henderson, C.R. (1975). Best linear unbiased estimation and prediction under a selection model, Biometrics, 31.

Henderson, C.R. (1984). Applications of Linear Models in Animal Breeding. University of Guelph, Guelph, Canada.

Koch, G.G. (1967). A procedure to estimate the population mean in random effects models, Technometrics, 9.

Koch, G.G. (1973). An alternative approach to multivariate response error models for sample survey data with applications to estimators involving subclass means, Journal of the American Statistical Association, 68.

Koch, G.G., Gillings, D.B., and Stokes, M.E. (1980). Biostatistical implications of design, sampling, and measurement to health science data analysis, Ann. Rev. Public Health, 1:163-225.

Merriam, P.A., Ockene, I.S., Hebert, J.R., Milagros, C.R., and Matthews, C.E. (1999). Seasonal variation of blood cholesterol levels: study methodology, Journal of Biological Rhythms, Vol. 14 No. 4.

Nelder, J.A. (1977). A reformulation of linear models (with discussion). Journal of the Royal Statistical Society A.

Ockene, I.S., Chiriboga, D.E., Stanek, E.J. III, Harmatz, M.G., Nicolosi, R., Saperia, G., Well, A.D., Merriam, P.A., Reed, G., Ma, Y., Matthews, C.E. and Hebert, J.R. (2004). Seasonal variation in serum cholesterol: Treatment implications and possible mechanisms. Archives of Internal Medicine, 164.

Robinson, G.K. (1991). That BLUP is a good thing: the estimation of random effects, Statistical Science, 6:15-51.

Stanek, E.J. III and Singer, J.M. (2004). Predicting Random Effects from Finite Population Clustered Samples with Response Error, Journal of the American Statistical Association, 99.

Voss, D.T. (1999). Resolving the Mixed Models Controversy. The American Statistician.

Table 1. Population Values and Parameters for Simple Example
Columns: Subject Name, Subject $s$, Latent Value $y_s$, Response Variance $\sigma_s^2$
Daisy 0
Lily 3 00
Rose 3 4
Source: c09ed33.xls

Table 2. Potentially Realizable Responses for the Set of Subjects of s = 1 and s = 3 Assuming a Mixed Model
Columns: Potentially Realized Response $t$, Subject, Response Variance $\sigma_j^2$, Mixed Model Latent Value $P_j$, Response Error $E_j$, MM Response $Y_j$
Daisy (j=1) 0
Rose (j=2) 4 4
Daisy (j=1) 0-9
Rose (j=2)
Daisy (j=1) 0
Rose (j=2)
Daisy (j=1) 0-9
Rose (j=2) 4-0
Source: c09ed33.xls

Table 3. Additional (Non-realizable) Responses for the Set of Subjects of s = 1 and s = 3 Assuming a Mixed Model
Columns: Nonrealizable Response $t$, Subject, Response Variance $\sigma_j^2$, Mixed Model Latent Value $P_j$, Response Error $E_j$, MM Response $Y_j$
5 Daisy (j=1) 3
Rose (j=2)
Daisy (j=1) -
Rose (j=2)
Daisy (j=1) 3
Rose (j=2)
Daisy (j=1) -
Rose (j=2)
Source: c09ed33.xls

Table 4. Predictors of MM-Latent Values, Difference from $P$, and MSE for the Set of Subjects s = 1 and s = 3.
Columns: $t$; for Daisy (j=1) and for Rose (j=2): Mixed Model Response $Y_j$, Predictor $\hat{P}_j$, Mixed Model Latent Value $P_j$, Difference $\hat{P}_j - P_j$, MSE $(\hat{P}_j - P_j)^2$; final row: Average
Source: c09ed33.xls

Table 5. Predictors of Subject's Latent Values, Difference from $y$, and MSE for the Set of Subjects s = 1 and s = 3.
Columns: $t$; for Daisy (j=1) and for Rose (j=2): Mixed Model Response $Y_j$, Predictor $\hat{P}_j$, Subject's Latent Value $y_j$, Difference $\hat{P}_j - y_j$, MSE $(\hat{P}_j - y_j)^2$; final row: Average
Source: c09ed33.xls

Table 6. Estimators of the Mean Latent Value $\bar{P} = \bar{y}$ from a Response Error Model and MM, the Difference from the Mean, and the MSE for the Set of Subjects s = 1 and s = 3.
Columns: $t$; Mean Response $\bar{Y}$; Mean MM Response $\bar{Y}$; WLS $\hat{\mu}$; True Ave Latent Value $\bar{P}$; Difference Mean Response $\bar{Y} - \bar{y}$; Difference Mean MM Response $\bar{Y} - \bar{P}$; Difference WLS $\hat{\mu} - \bar{P}$; MSE Mean Response $(\bar{Y} - \bar{y})^2$; MSE Mean MM Response $(\bar{Y} - \bar{P})^2$; MSE WLS $(\hat{\mu} - \bar{P})^2$; final row: Average
Source: c09ed33.xls

Sampling, WLS, and Mixed Models Festschrift to Honor Professor Gary Koch

Sampling, WLS, and Mixed Models Festschrift to Honor Professor Gary Koch Samplig, WLS, ad Mixed Models Festschrift to Hoor Professor Gary Koch Edward J. Staek Departmet of Public Health Uiversity of Massachusetts, Amherst, MA 40 Arold House 75 N. Pleasat Street Uiversity of

More information

The standard deviation of the mean

The standard deviation of the mean Physics 6C Fall 20 The stadard deviatio of the mea These otes provide some clarificatio o the distictio betwee the stadard deviatio ad the stadard deviatio of the mea.. The sample mea ad variace Cosider

More information

Properties and Hypothesis Testing

Properties and Hypothesis Testing Chapter 3 Properties ad Hypothesis Testig 3.1 Types of data The regressio techiques developed i previous chapters ca be applied to three differet kids of data. 1. Cross-sectioal data. 2. Time series data.

More information

Basics of Probability Theory (for Theory of Computation courses)

Basics of Probability Theory (for Theory of Computation courses) Basics of Probability Theory (for Theory of Computatio courses) Oded Goldreich Departmet of Computer Sciece Weizma Istitute of Sciece Rehovot, Israel. oded.goldreich@weizma.ac.il November 24, 2008 Preface.

More information

Optimal Estimator for a Sample Set with Response Error. Ed Stanek

Optimal Estimator for a Sample Set with Response Error. Ed Stanek Optial Estiator for a Saple Set wit Respose Error Ed Staek Itroductio We develop a optial estiator siilar to te FP estiator wit respose error tat was cosidered i c08ed63doc Te first 6 pages of tis docuet

More information

It should be unbiased, or approximately unbiased. Variance of the variance estimator should be small. That is, the variance estimator is stable.

It should be unbiased, or approximately unbiased. Variance of the variance estimator should be small. That is, the variance estimator is stable. Chapter 10 Variace Estimatio 10.1 Itroductio Variace estimatio is a importat practical problem i survey samplig. Variace estimates are used i two purposes. Oe is the aalytic purpose such as costructig

More information

Optimally Sparse SVMs

Optimally Sparse SVMs A. Proof of Lemma 3. We here prove a lower boud o the umber of support vectors to achieve geeralizatio bouds of the form which we cosider. Importatly, this result holds ot oly for liear classifiers, but

More information

1 Inferential Methods for Correlation and Regression Analysis

1 Inferential Methods for Correlation and Regression Analysis 1 Iferetial Methods for Correlatio ad Regressio Aalysis I the chapter o Correlatio ad Regressio Aalysis tools for describig bivariate cotiuous data were itroduced. The sample Pearso Correlatio Coefficiet

More information

Random Variables, Sampling and Estimation

Random Variables, Sampling and Estimation Chapter 1 Radom Variables, Samplig ad Estimatio 1.1 Itroductio This chapter will cover the most importat basic statistical theory you eed i order to uderstad the ecoometric material that will be comig

More information

(3) If you replace row i of A by its sum with a multiple of another row, then the determinant is unchanged! Expand across the i th row:

(3) If you replace row i of A by its sum with a multiple of another row, then the determinant is unchanged! Expand across the i th row: Math 5-4 Tue Feb 4 Cotiue with sectio 36 Determiats The effective way to compute determiats for larger-sized matrices without lots of zeroes is to ot use the defiitio, but rather to use the followig facts,

More information

Element sampling: Part 2

Element sampling: Part 2 Chapter 4 Elemet samplig: Part 2 4.1 Itroductio We ow cosider uequal probability samplig desigs which is very popular i practice. I the uequal probability samplig, we ca improve the efficiecy of the resultig

More information

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering

CEE 522 Autumn Uncertainty Concepts for Geotechnical Engineering CEE 5 Autum 005 Ucertaity Cocepts for Geotechical Egieerig Basic Termiology Set A set is a collectio of (mutually exclusive) objects or evets. The sample space is the (collectively exhaustive) collectio

More information

Estimation for Complete Data

Estimation for Complete Data Estimatio for Complete Data complete data: there is o loss of iformatio durig study. complete idividual complete data= grouped data A complete idividual data is the oe i which the complete iformatio of

More information

Lecture Overview. 2 Permutations and Combinations. n(n 1) (n (k 1)) = n(n 1) (n k + 1) =

Lecture Overview. 2 Permutations and Combinations. n(n 1) (n (k 1)) = n(n 1) (n k + 1) = COMPSCI 230: Discrete Mathematics for Computer Sciece April 8, 2019 Lecturer: Debmalya Paigrahi Lecture 22 Scribe: Kevi Su 1 Overview I this lecture, we begi studyig the fudametals of coutig discrete objects.

More information

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence

Chapter 3. Strong convergence. 3.1 Definition of almost sure convergence Chapter 3 Strog covergece As poited out i the Chapter 2, there are multiple ways to defie the otio of covergece of a sequece of radom variables. That chapter defied covergece i probability, covergece i

More information

Commutativity in Permutation Groups

Commutativity in Permutation Groups Commutativity i Permutatio Groups Richard Wito, PhD Abstract I the group Sym(S) of permutatios o a oempty set S, fixed poits ad trasiet poits are defied Prelimiary results o fixed ad trasiet poits are

More information

Discrete-Time Systems, LTI Systems, and Discrete-Time Convolution

Discrete-Time Systems, LTI Systems, and Discrete-Time Convolution EEL5: Discrete-Time Sigals ad Systems. Itroductio I this set of otes, we begi our mathematical treatmet of discrete-time s. As show i Figure, a discrete-time operates or trasforms some iput sequece x [

More information

This is an introductory course in Analysis of Variance and Design of Experiments.

This is an introductory course in Analysis of Variance and Design of Experiments. 1 Notes for M 384E, Wedesday, Jauary 21, 2009 (Please ote: I will ot pass out hard-copy class otes i future classes. If there are writte class otes, they will be posted o the web by the ight before class

More information

Infinite Sequences and Series

Infinite Sequences and Series Chapter 6 Ifiite Sequeces ad Series 6.1 Ifiite Sequeces 6.1.1 Elemetary Cocepts Simply speakig, a sequece is a ordered list of umbers writte: {a 1, a 2, a 3,...a, a +1,...} where the elemets a i represet

More information

ECON 3150/4150, Spring term Lecture 3

ECON 3150/4150, Spring term Lecture 3 Itroductio Fidig the best fit by regressio Residuals ad R-sq Regressio ad causality Summary ad ext step ECON 3150/4150, Sprig term 2014. Lecture 3 Ragar Nymoe Uiversity of Oslo 21 Jauary 2014 1 / 30 Itroductio

More information

Improved Class of Ratio -Cum- Product Estimators of Finite Population Mean in two Phase Sampling

Improved Class of Ratio -Cum- Product Estimators of Finite Population Mean in two Phase Sampling Global Joural of Sciece Frotier Research: F Mathematics ad Decisio Scieces Volume 4 Issue 2 Versio.0 Year 204 Type : Double Blid Peer Reviewed Iteratioal Research Joural Publisher: Global Jourals Ic. (USA

More information

Estimation of Gumbel Parameters under Ranked Set Sampling

Estimation of Gumbel Parameters under Ranked Set Sampling Joural of Moder Applied Statistical Methods Volume 13 Issue 2 Article 11-2014 Estimatio of Gumbel Parameters uder Raked Set Samplig Omar M. Yousef Al Balqa' Applied Uiversity, Zarqa, Jorda, abuyaza_o@yahoo.com

More information

Hoggatt and King [lo] defined a complete sequence of natural numbers

Hoggatt and King [lo] defined a complete sequence of natural numbers REPRESENTATIONS OF N AS A SUM OF DISTINCT ELEMENTS FROM SPECIAL SEQUENCES DAVID A. KLARNER, Uiversity of Alberta, Edmoto, Caada 1. INTRODUCTION Let a, I deote a sequece of atural umbers which satisfies

More information

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d

Linear regression. Daniel Hsu (COMS 4771) (y i x T i β)2 2πσ. 2 2σ 2. 1 n. (x T i β y i ) 2. 1 ˆβ arg min. β R n d Liear regressio Daiel Hsu (COMS 477) Maximum likelihood estimatio Oe of the simplest liear regressio models is the followig: (X, Y ),..., (X, Y ), (X, Y ) are iid radom pairs takig values i R d R, ad Y

More information

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4

MATH 320: Probability and Statistics 9. Estimation and Testing of Parameters. Readings: Pruim, Chapter 4 MATH 30: Probability ad Statistics 9. Estimatio ad Testig of Parameters Estimatio ad Testig of Parameters We have bee dealig situatios i which we have full kowledge of the distributio of a radom variable.

More information

Principle Of Superposition

Principle Of Superposition ecture 5: PREIMINRY CONCEP O RUCUR NYI Priciple Of uperpositio Mathematically, the priciple of superpositio is stated as ( a ) G( a ) G( ) G a a or for a liear structural system, the respose at a give

More information

Discrete Mathematics for CS Spring 2008 David Wagner Note 22

Discrete Mathematics for CS Spring 2008 David Wagner Note 22 CS 70 Discrete Mathematics for CS Sprig 2008 David Wager Note 22 I.I.D. Radom Variables Estimatig the bias of a coi Questio: We wat to estimate the proportio p of Democrats i the US populatio, by takig

More information

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A.

Random Walks on Discrete and Continuous Circles. by Jeffrey S. Rosenthal School of Mathematics, University of Minnesota, Minneapolis, MN, U.S.A. Radom Walks o Discrete ad Cotiuous Circles by Jeffrey S. Rosethal School of Mathematics, Uiversity of Miesota, Mieapolis, MN, U.S.A. 55455 (Appeared i Joural of Applied Probability 30 (1993), 780 789.)

More information

Estimation of Population Mean Using Co-Efficient of Variation and Median of an Auxiliary Variable

Estimation of Population Mean Using Co-Efficient of Variation and Median of an Auxiliary Variable Iteratioal Joural of Probability ad Statistics 01, 1(4: 111-118 DOI: 10.593/j.ijps.010104.04 Estimatio of Populatio Mea Usig Co-Efficiet of Variatio ad Media of a Auxiliary Variable J. Subramai *, G. Kumarapadiya

More information

Information-based Feature Selection

Information-based Feature Selection Iformatio-based Feature Selectio Farza Faria, Abbas Kazeroui, Afshi Babveyh Email: {faria,abbask,afshib}@staford.edu 1 Itroductio Feature selectio is a topic of great iterest i applicatios dealig with

More information

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n.

Resampling Methods. X (1/2), i.e., Pr (X i m) = 1/2. We order the data: X (1) X (2) X (n). Define the sample median: ( n. Jauary 1, 2019 Resamplig Methods Motivatio We have so may estimators with the property θ θ d N 0, σ 2 We ca also write θ a N θ, σ 2 /, where a meas approximately distributed as Oce we have a cosistet estimator

More information

Statistics 511 Additional Materials

Statistics 511 Additional Materials Cofidece Itervals o mu Statistics 511 Additioal Materials This topic officially moves us from probability to statistics. We begi to discuss makig ifereces about the populatio. Oe way to differetiate probability

More information

Chapter 9 - CD companion 1. A Generic Implementation; The Common-Merge Amplifier. 1 τ is. ω ch. τ io

Chapter 9 - CD companion 1. A Generic Implementation; The Common-Merge Amplifier. 1 τ is. ω ch. τ io Chapter 9 - CD compaio CHAPTER NINE CD-9.2 CD-9.2. Stages With Voltage ad Curret Gai A Geeric Implemetatio; The Commo-Merge Amplifier The advaced method preseted i the text for approximatig cutoff frequecies

More information

Math 155 (Lecture 3)

Math 155 (Lecture 3) Math 55 (Lecture 3) September 8, I this lecture, we ll cosider the aswer to oe of the most basic coutig problems i combiatorics Questio How may ways are there to choose a -elemet subset of the set {,,,

More information

Linear Regression Demystified

Linear Regression Demystified Liear Regressio Demystified Liear regressio is a importat subject i statistics. I elemetary statistics courses, formulae related to liear regressio are ofte stated without derivatio. This ote iteds to

More information

Summary and Discussion on Simultaneous Analysis of Lasso and Dantzig Selector

Summary and Discussion on Simultaneous Analysis of Lasso and Dantzig Selector Summary ad Discussio o Simultaeous Aalysis of Lasso ad Datzig Selector STAT732, Sprig 28 Duzhe Wag May 4, 28 Abstract This is a discussio o the work i Bickel, Ritov ad Tsybakov (29). We begi with a short

More information

Modified Ratio Estimators Using Known Median and Co-Efficent of Kurtosis
