A COMPARISON OF SMALL AREA AND CALIBRATION ESTIMATORS VIA SIMULATION

STATISTICS IN TRANSITION new series and SURVEY METHODOLOGY
Joint Issue: Small Area Estimation 2014
Vol. 17, No. 1, pp. 133–154

M. A. Hidiroglou (1), V. M. Estevao (2)

(1) Michael A. Hidiroglou, Statistical Research and Innovation Division, Statistics Canada, 16 D, R.-H.-Coats Building, Ottawa, Ontario, Canada, K1A. E-mail: hidirog@yahoo.ca.
(2) Victor M. Estevao, Statistical Research and Innovation Division, Statistics Canada, 16 D, R.-H.-Coats Building, Ottawa, Ontario, Canada, K1A. E-mail: Victor.Estevao@statcan.gc.ca.

ABSTRACT

Domain estimates are typically obtained using calibration estimators that are direct or modified direct. They are direct if they strictly use data within the domain of interest. They are modified direct if they use data both within and outside the domain of interest. An alternative way of producing these estimates is through small area procedures. In this article, we compare the performance of these two approaches via a simulation. The population is generated using a hierarchical model that includes both area effects and unit level random errors. The population is made up of mutually exclusive domains of different sizes, ranging from a small number of units to a large number of units. We select many independent simple random samples of fixed size from the population and compute various estimates for each sample using the available auxiliary information. The estimates computed for the simulation include the Horvitz-Thompson estimator, the synthetic estimator (indirect estimate), calibration estimators, and unit level based estimators (small area estimate). The performance of these estimators is summarized based on their design-based properties.

Key words: area level, unit level, calibration estimates, small area estimates, simulation.

1. Introduction

Domain estimates at Statistics Canada are typically obtained using well-established methods based on calibration estimation. The calibration is direct or modified direct. It is direct if it is based on data within the domain of interest. It is modified direct if it is based on data both within and outside the domain of interest. These methods can be viewed as design-based procedures, as the variance of the resulting estimators is evaluated under the randomization distribution. The randomization distribution of an estimator is the distribution over all possible samples that could be selected from the target population of interest under the sampling design used to select the sample, with the population parameters considered as fixed values.

Another way of producing these estimates is through small area methods. These methods are particularly important when the sample size in the domains is small. They can improve the reliability of the direct estimates provided that the variable of interest is well correlated with auxiliary variables x that are available from administrative or other files. Small area estimation essentially combines direct estimates with model-based estimates in an optimal manner. The model-based estimates involve known population totals (auxiliary data) and estimates of the regression between the variable of interest and the auxiliary data across the small areas. In general, these models are classified into two groups: unit level models and area level models. Unit level models are generally based on observation units (e.g., persons or companies) from the survey and auxiliary variables associated with each observation, whereas area level models are based on direct survey estimates aggregated from the unit level data and related area level auxiliary variables; see Rao (2003) for an overview of small area models.

The more recent literature that covers empirical assessment of the properties of various small area and domain estimators includes Lehtonen and Veijanen (2009), Datta (2009), and Pfeffermann (2013). Lehtonen and Veijanen (2009) focused on design-based methods (calibration and regression) using auxiliary data. They reviewed the work on the extension of the linear form of the Generalized Regression Estimator (GREG) given in Särndal et al. (1992) to include logistic, multinomial logistic and mixed models for domain estimation. Datta (2009) reviewed the development of model-based procedures to obtain small area estimates, focusing in particular on the theoretical properties of the resulting estimators. Pfeffermann (2013) reviewed both design-based and model-based procedures, as well as recent developments in these two procedures.

Domain estimates are currently obtained via design-based procedures at Statistics Canada. However, the increasing requirement for producing estimates for "small domains" has encouraged the adoption of model-based procedures. A SAS-based prototype (Estevao et al. 2014) has recently been developed at Statistics Canada to respond to these requirements. The prototype currently incorporates two well-known methods, initially developed by Fay and Herriot (1979) for area level estimation, and by Battese, Harter, and Fuller (1988) for unit level estimation.

Although the theoretical properties of the estimators included in the prototype are known, they were investigated via a simulation. In the simulations, we looked at the properties of estimators of domain totals. We compared model-based small area estimators with traditional estimators. The latter include the Horvitz-Thompson estimator, two calibration estimators, the modified regression estimator and the synthetic estimator. The small area estimators are the EBLUP and Pseudo-EBLUP estimators based on a unit level model. More details on all of these estimators are given in section 2.

The simulation setup and results are reported in section 3. Section 4 provides a few conclusions from our findings.

2. Sample design

Large scale surveys are designed to satisfy reliability requirements for some subsets (domains) of the population. Examples of these subsets include partitions below the level of the initial geographical / industrial detail requested by the client. If such subsets are required before the sample is selected, then such domains are labelled as planned domains (Singh, Gambino and Mantel 1994). Such planned domains will have some of the sample allocated to them to obtain unbiased estimates with the required precision using direct estimation procedures. If these domains are identified after the sample has been selected, they are known as unplanned domains. Note that, in any event, unplanned domains will exist for most surveys. An example taken from business surveys is a change of industry during data collection: a business initially classified as industry A becomes industry B. Such a business would be tabulated as part of the businesses of type B, but would retain its original sampling weight. Another example, taken from household surveys, would be the arbitrary production of estimates below a geographical level that was not part of the allocation process of the sample.

Traditional or small area estimators can be used for either planned or unplanned domains. As domain estimation for most surveys at Statistics Canada is mostly of the unplanned type, we have designed our simulation to reflect this tendency: that is, no sample units are allocated to the domains prior to sample selection. Domain estimates are produced after sample selection, and the number of sample units falling in each domain is a random variable. Our simulation reflects this point, and we use the simplest sample design to carry it out.

We drew repeated samples $s$ of size $n$ from the population $U$ of size $N$ using simple random sampling without replacement. The survey design weight associated with unit $j \in U$ is denoted as $w_j$. Let $s_d$, $d = 1, 2, \ldots, D$, be the portion of the sample $s$ that overlaps with domain $U_d$ (of known size $N_d$). Let the realized sample size in domain $U_d$ be $n_d$. The data in the population are denoted as $(y_j, \mathbf{x}_j)$ for each element $j \in U$. The $y$ variable is the one of interest, while $\mathbf{x}$ is the vector of auxiliary data.

Domain statistics can be computed with the usual operators of regular estimation (e.g., mean and variance) via the following transformation. In domain $U_d$, we denote the variable of interest as $y_{dj}$, where $y_{dj} = y_j$ if $j \in U_d$ and 0 otherwise. The associated vector of auxiliary variables is defined as $\mathbf{x}_{dj}$, where $\mathbf{x}_{dj} = \mathbf{x}_j$ if $j \in U_d$ and $\mathbf{0}$ otherwise.
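The transformation above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name, the toy sample and the equal weight $w_j = N/n$ (which holds under simple random sampling without replacement) are assumptions introduced here.

```python
def domain_total_estimate(sample, domain, N):
    """Weighted total of the domain variable y_dj over the whole sample.

    sample: list of (domain_label, y_j) pairs for the n sampled units
    domain: label of the domain of interest
    N:      population size; under SRSWOR every unit has weight w_j = N / n
    """
    n = len(sample)
    w = N / n  # assumed equal design weight under simple random sampling
    # y_dj = y_j if unit j belongs to the domain, and 0 otherwise
    y_d = [y if label == domain else 0.0 for label, y in sample]
    return sum(w * value for value in y_d)


# Hypothetical toy sample of n = 5 units drawn from a population of N = 10
toy_sample = [("A", 2.0), ("B", 5.0), ("A", 4.0), ("C", 1.0), ("B", 3.0)]
estimate_a = domain_total_estimate(toy_sample, "A", N=10)
```

Summing the transformed variable over the whole sample gives the same value as summing $w_j y_j$ over $s_d$ only, and a domain with no sampled units automatically receives an estimate of 0.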

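The objective of the present study is to compare the properties of model-based small area estimators for domains with those traditionally used in survey estimation. We considered seven estimators of the domain total $Y_d = \sum_{j \in U_d} y_j$: four are traditional estimators and three are small area estimators. We first present the traditional estimators.

2.1. Traditional estimators

Horvitz-Thompson: The Horvitz-Thompson estimator $\hat{Y}_{d,HT}$, $d = 1, 2, \ldots, D$, uses no auxiliary information. It is defined as

$$\hat{Y}_{d,HT} = \begin{cases} \sum_{j \in s_d} w_j y_j & \text{if } n_d > 0 \\ 0 & \text{otherwise.} \end{cases}$$

We set $\hat{Y}_{d,HT}$ to 0 if there are no sample units in the domain, ensuring unbiased estimation over all samples $s$ drawn from $U$. Although this estimator is unbiased, it produces inefficient estimates.

Calibration Estimators: We consider two calibration estimators, $\hat{Y}_{d,CALUd}$ and $\hat{Y}_{d,CALU}$, that use auxiliary information at different levels. They are applications of calibration given in Deville and Särndal (1992) adapted to domain estimation. The direct estimator $\hat{Y}_{d,CALUd}$ uses auxiliary information at the domain level, while the modified direct estimator $\hat{Y}_{d,CALU}$ uses information at the population level. Estimator $\hat{Y}_{d,CALUd}$ is known to be more efficient than $\hat{Y}_{d,CALU}$. However, estimator $\hat{Y}_{d,CALUd}$ has some drawbacks. It is not always possible to obtain auxiliary information at the domain level. Even if this information is available, we cannot produce estimates using $\hat{Y}_{d,CALUd}$ if there are no sample units in the domain. Furthermore, this estimator can produce erratic values when there are only a few units in the domain. To prevent this, we need to make sure that the number of sample units in the domain is larger than the number of auxiliary variables. As a minimal requirement, given that there are two auxiliary variables (intercept, x), $\hat{Y}_{d,CALUd}$ can be computed only if there are 3 or more sample units in a domain. Otherwise, we cannot produce a value, and we set it to missing. This means that we only work with a subset of all possible samples. If we set the value of $\hat{Y}_{d,CALUd}$ to 0 when there is an insufficient number of observations $n_d$ in domain $U_d$, this would result in a biased estimator. As for $\hat{Y}_{d,CALU}$, when there are no sample units in the domain, we set the value of this estimator to 0. This ensures that it is approximately design unbiased for the domain total.

Estimator $\hat{Y}_{d,CALUd}$, $d = 1, 2, \ldots, D$, is given by:

$$\hat{Y}_{d,CALUd} = \begin{cases} \sum_{j \in s_d} w_j y_j + (\mathbf{X}_d - \hat{\mathbf{X}}_{d,HT})^T \hat{\boldsymbol{\beta}}_{d,CALUd} & \text{if } n_d \ge 3 \\ \text{(missing)} & \text{if } n_d < 3 \end{cases}$$

with $\mathbf{X}_d = \sum_{j \in U_d} \mathbf{x}_j$, $\hat{\mathbf{X}}_{d,HT} = \sum_{j \in s_d} w_j \mathbf{x}_j$ and

$$\hat{\boldsymbol{\beta}}_{d,CALUd} = \left( \sum_{j \in s_d} \frac{w_j \mathbf{x}_j \mathbf{x}_j^T}{c_j} \right)^{-1} \sum_{j \in s_d} \frac{w_j \mathbf{x}_j y_j}{c_j}.$$

Estimator $\hat{Y}_{d,CALU}$, $d = 1, 2, \ldots, D$, is given by:

$$\hat{Y}_{d,CALU} = \begin{cases} \sum_{j \in s_d} w_j y_j + (\mathbf{X} - \hat{\mathbf{X}}_{HT})^T \hat{\boldsymbol{\beta}}_{d,CALU} & \text{if } n_d > 0 \\ 0 & \text{if } n_d = 0 \end{cases}$$

with $\mathbf{X} = \sum_{j \in U} \mathbf{x}_j$, $\hat{\mathbf{X}}_{HT} = \sum_{j \in s} w_j \mathbf{x}_j$ and

$$\hat{\boldsymbol{\beta}}_{d,CALU} = \left( \sum_{j \in s} \frac{w_j \mathbf{x}_j \mathbf{x}_j^T}{c_j} \right)^{-1} \sum_{j \in s} \frac{w_j \mathbf{x}_j y_{dj}}{c_j}.$$

Modified Regression (REG): The modified regression estimator $\hat{Y}_{d,REG}$, $d = 1, 2, \ldots, D$, is due to Woodruff (1966). It is of interest as it was used to produce area breakdowns of the monthly national estimates of the US Census Bureau retail trade survey. Note that it is a modified direct estimator. Singh and Mian (1995) point out that it can be viewed as a calibration estimator $\sum_{j \in s} \tilde{w}_{dj} y_j$, where the calibration weight $\tilde{w}_{dj}$ is obtained by minimizing the chi-square distance $\sum_{j \in s} c_j (\tilde{w}_{dj} - a_{dj} w_j)^2 / w_j$, subject to the constraints $\sum_{j \in s} \tilde{w}_{dj} \mathbf{x}_j = \mathbf{X}_d$. Here $a_{dj}$ is the domain indicator variable. Estimator $\hat{Y}_{d,REG}$ is approximately design-unbiased as the overall sample size increases. It is given by:

$$\hat{Y}_{d,REG} = \begin{cases} \sum_{j \in s_d} w_j y_j + (\mathbf{X}_d - \hat{\mathbf{X}}_{d,HT})^T \hat{\boldsymbol{\beta}}_{REG} & \text{if } n_d > 0 \\ \mathbf{X}_d^T \hat{\boldsymbol{\beta}}_{REG} & \text{if } n_d = 0 \end{cases}$$

with $\mathbf{X}_d = \sum_{j \in U_d} \mathbf{x}_j$, $\hat{\mathbf{X}}_{d,HT} = \sum_{j \in s_d} w_j \mathbf{x}_j$ and

$$\hat{\boldsymbol{\beta}}_{REG} = \left( \sum_{j \in s} \frac{w_j \mathbf{x}_j \mathbf{x}_j^T}{c_j} \right)^{-1} \sum_{j \in s} \frac{w_j \mathbf{x}_j y_j}{c_j}.$$

The $c_j$ term, $c_j > 0$, associated with the estimators that use auxiliary data reflects that the error terms $e_j$ in the implied working model $y_j = \mathbf{x}_j^T \boldsymbol{\beta} + e_j$ are distributed independently with mean zero and variance $\sigma^2 c_j$.

2.2. Small area estimators

The simplest small area estimator is the synthetic estimator (SYN), $\hat{Y}_{d,SYN}$, $d = 1, 2, \ldots, D$. It is given by $\hat{Y}_{d,SYN} = \mathbf{X}_d^T \hat{\boldsymbol{\beta}}_{SYN}$, where $\mathbf{X}_d = \sum_{j \in U_d} \mathbf{x}_j$ and

$$\hat{\boldsymbol{\beta}}_{SYN} = \left( \sum_{j \in s} \frac{w_j \mathbf{x}_j \mathbf{x}_j^T}{c_j} \right)^{-1} \sum_{j \in s} \frac{w_j \mathbf{x}_j y_j}{c_j}.$$

This estimator is design-biased, with bias given by

$$\text{Bias}(\hat{Y}_{d,SYN}) \approx \mathbf{X}_d^T \mathbf{B} - Y_d, \quad \text{where } \mathbf{B} = \left( \sum_{j \in U} \frac{\mathbf{x}_j \mathbf{x}_j^T}{c_j} \right)^{-1} \sum_{j \in U} \frac{\mathbf{x}_j y_j}{c_j}$$

is the population regression vector.

The next two small area estimators are based on a hierarchical model given by:

$$y_{dj} = \mathbf{x}_{dj}^T \boldsymbol{\beta} + v_d + e_{dj}, \quad j \in U_d, \qquad (1)$$

where $v_d \overset{iid}{\sim} N(0, \sigma_v^2)$, $e_{dj} \overset{iid}{\sim} N(0, \sigma_e^2 c_{dj})$, and $c_{dj}$ accounts for possible heterogeneity of the $e_{dj}$ residuals. In our application of this model, the areas are our domains of interest. The quantity $\mathbf{x}_{dj}^T \boldsymbol{\beta}$ is the fixed effect, which is assumed to be a linear combination of the auxiliary variables $\mathbf{x}_{dj}$. The residuals $v_d$ and $e_{dj}$ are respectively the random effect for area $d$ and the random error for unit $j$ in area $d$. The term $c_{dj}$ translates to $a_{dj} = 1/c_{dj}$ in the various formulas that follow.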
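To make the synthetic and modified regression estimators defined above concrete, the following Python sketch computes both for the special case $\mathbf{x}_j = (1, x_j)^T$. It is a simplified illustration under assumptions introduced here (function names, a closed-form 2 x 2 solve, a common weight `w` for all units, and a toy data set), not the SAS prototype mentioned in the introduction.

```python
def wls_beta(xs, ys, ws, cs):
    """Solve (sum w x x^T / c)^{-1} (sum w x y / c) for x_j = (1, x_j)."""
    a00 = sum(w / c for w, c in zip(ws, cs))
    a01 = sum(w * x / c for x, w, c in zip(xs, ws, cs))
    a11 = sum(w * x * x / c for x, w, c in zip(xs, ws, cs))
    b0 = sum(w * y / c for y, w, c in zip(ys, ws, cs))
    b1 = sum(w * x * y / c for x, y, w, c in zip(xs, ys, ws, cs))
    det = a00 * a11 - a01 * a01
    return (a11 * b0 - a01 * b1) / det, (a00 * b1 - a01 * b0) / det


def synthetic_total(beta, N_d, X_d):
    """X_d^T beta, where X_d = (N_d, sum of x over the domain population)."""
    return beta[0] * N_d + beta[1] * X_d


def modified_regression_total(beta, N_d, X_d, dom_xs, dom_ys, w):
    """HT part plus (X_d - X_hat_d,HT)^T beta; synthetic if s_d is empty."""
    if not dom_xs:
        return synthetic_total(beta, N_d, X_d)
    y_ht = w * sum(dom_ys)   # Horvitz-Thompson estimate of the y total
    n_ht = w * len(dom_xs)   # Horvitz-Thompson estimate of N_d
    x_ht = w * sum(dom_xs)   # Horvitz-Thompson estimate of the x total
    return y_ht + (N_d - n_ht) * beta[0] + (X_d - x_ht) * beta[1]


# Hypothetical sample with y = 3 + 2x exactly, weights w_j = 2 and c_j = x_j
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0 + 2.0 * x for x in xs]
beta = wls_beta(xs, ys, [2.0] * 5, xs)
syn = synthetic_total(beta, N_d=4, X_d=10.0)
reg = modified_regression_total(beta, 4, 10.0, [1.0, 3.0], [5.0, 9.0], w=2.0)
```

When the working model fits the data exactly, as in this toy case, the regression adjustment reproduces the synthetic value; the two estimators differ only through the domain residuals.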

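Empirical Best Linear Unbiased Predictor (EBLUP): This estimator, denoted as $\hat{Y}_{d,EBLUP}$, $d = 1, 2, \ldots, D$, is given in Rao (2003, p. 136). It is an extension of the Battese, Harter, and Fuller (1988) estimator when the error structure of the residuals is not homogeneous. It is given by:

$$\hat{Y}_{d,EBLUP} = \begin{cases} N_d \{ \bar{\mathbf{X}}_d^T \hat{\boldsymbol{\beta}}_{EBLUP} + \hat{\gamma}_{da} ( \bar{y}_{da} - \bar{\mathbf{x}}_{da}^T \hat{\boldsymbol{\beta}}_{EBLUP} ) \} & \text{if } n_d > 0 \\ \mathbf{X}_d^T \hat{\boldsymbol{\beta}}_{EBLUP} & \text{if } n_d = 0. \end{cases}$$

The terms making up $\hat{Y}_{d,EBLUP}$ include $N_d$, $\bar{\mathbf{X}}_d$, $\hat{\gamma}_{da}$, $\bar{y}_{da}$, $\bar{\mathbf{x}}_{da}$, and $\hat{\boldsymbol{\beta}}_{EBLUP}$. These terms are defined as follows: $\bar{\mathbf{X}}_d = \mathbf{X}_d / N_d$,

$$\hat{\gamma}_{da} = \frac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \hat{\sigma}_e^2 / a_{d\cdot}}, \quad \text{where } a_{d\cdot} = \sum_{j \in s_d} a_{dj},$$

$$\bar{y}_{da} = \frac{\sum_{j \in s_d} a_{dj} y_{dj}}{\sum_{j \in s_d} a_{dj}}, \quad \text{and} \quad \bar{\mathbf{x}}_{da} = \frac{\sum_{j \in s_d} a_{dj} \mathbf{x}_{dj}}{\sum_{j \in s_d} a_{dj}}.$$

The estimated regression vector is given by:

$$\hat{\boldsymbol{\beta}}_{EBLUP} = \left( \sum_{d=1}^{D} \sum_{j \in s_d} a_{dj} \mathbf{x}_{dj} ( \mathbf{x}_{dj} - \hat{\gamma}_{da} \bar{\mathbf{x}}_{da} )^T \right)^{-1} \sum_{d=1}^{D} \sum_{j \in s_d} a_{dj} ( \mathbf{x}_{dj} - \hat{\gamma}_{da} \bar{\mathbf{x}}_{da} ) y_{dj}.$$

This estimator is not design consistent, unless the sampling design is self-weighting.

Pseudo-EBLUP (PEBLUP): This estimator, denoted as $\hat{Y}_{d,PEBLUP}$, $d = 1, 2, \ldots, D$, is an extension of the Pseudo-EBLUP estimator given in You and Rao (2002). It accounts for the heterogeneity of the $e_{dj}$ residuals in model (1). It includes the survey weights $w_j$, $j \in s_d$, in the regression coefficient and the parameter estimate:

$$\hat{Y}_{d,PEBLUP} = \begin{cases} N_d \{ \bar{\mathbf{X}}_d^T \hat{\boldsymbol{\beta}}_{PEBLUP} + \hat{\gamma}_{dw} ( \bar{y}_{dw} - \bar{\mathbf{x}}_{dw}^T \hat{\boldsymbol{\beta}}_{PEBLUP} ) \} & \text{if } n_d > 0 \\ \mathbf{X}_d^T \hat{\boldsymbol{\beta}}_{PEBLUP} & \text{if } n_d = 0. \end{cases}$$

The terms making up $\hat{Y}_{d,PEBLUP}$ include $N_d$, $\bar{\mathbf{X}}_d$, $\hat{\gamma}_{dw}$, $\bar{y}_{dw}$, $\bar{\mathbf{x}}_{dw}$, and $\hat{\boldsymbol{\beta}}_{PEBLUP}$. These terms are defined as follows:

$$\hat{\gamma}_{dw} = \frac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \hat{\sigma}_e^2 \delta_{dw}^2}, \quad \text{where } \delta_{dw}^2 = \sum_{j \in s_d} \tilde{w}_{dj}^2 / a_{dj},$$

$$\bar{y}_{dw} = \sum_{j \in s_d} \tilde{w}_{dj} y_{dj}, \quad \bar{\mathbf{x}}_{dw} = \sum_{j \in s_d} \tilde{w}_{dj} \mathbf{x}_{dj}, \quad \tilde{w}_{dj} = \frac{w_j}{\sum_{j \in s_d} w_j} \quad \text{for } d = 1, 2, \ldots, D.$$

The estimated regression vector is given by:

$$\hat{\boldsymbol{\beta}}_{PEBLUP} = \left( \sum_{d=1}^{D} \sum_{j \in s_d} w_j a_{dj} \mathbf{x}_{dj} ( \mathbf{x}_{dj} - \hat{\gamma}_{dwa} \bar{\mathbf{x}}_{dwa} )^T \right)^{-1} \sum_{d=1}^{D} \sum_{j \in s_d} w_j a_{dj} ( \mathbf{x}_{dj} - \hat{\gamma}_{dwa} \bar{\mathbf{x}}_{dwa} ) y_{dj}$$

with

$$\bar{\mathbf{x}}_{dwa} = \frac{\sum_{j \in s_d} w_j a_{dj} \mathbf{x}_{dj}}{\sum_{j \in s_d} w_j a_{dj}} \quad \text{and} \quad \hat{\gamma}_{dwa} = \frac{\hat{\sigma}_v^2}{\hat{\sigma}_v^2 + \hat{\sigma}_e^2 \delta_{dwa}^2}, \quad \text{where } \delta_{dwa}^2 = \frac{\sum_{j \in s_d} w_j^2 a_{dj}}{\left( \sum_{j \in s_d} w_j a_{dj} \right)^2}.$$

This estimator is design consistent.

3. Simulation

Surveys produced at Statistics Canada range from stratified one-stage simple random sampling designs, typically used for business surveys, to more complex stratified multi-stage designs with unequal selection probabilities at each stage, typically used for household surveys. We opted for a single stage simple random sample selected from the population, as it is a simplification of the sample designs used for business surveys. Had we chosen a sampling design with unequal weights, we would have had to account for the possible impact of informative sampling on the small area estimators using the procedure given in Pfeffermann and Sverchkov (2007). Verret, Rao, and Hidiroglou (2015) used a simpler procedure than the one given in Pfeffermann and Sverchkov (2007). Their procedure accounted for unequal selection probabilities for model-based small area estimators by incorporating them into the model. Their simulation used a design-model (pm) approach. Their results showed that incorporating the unequal selection probabilities significantly improved the performance (average absolute bias and average RMSE) of EBLUP, but had marginal impact on PEBLUP.

3.1. Population Generation and Sample Selection

A population $U$ consisting of 4,640 units was created by generating data $(x_j, y_j)$ for three separate subsets of the population (groups) with different intercepts and slopes. Each group was split into mutually exclusive and exhaustive domains as follows: Group 1 was split into nine domains $U_1, \ldots, U_9$; Group 2 was split into ten domains $U_{10}, \ldots, U_{19}$; and Group 3 was split into ten domains $U_{20}, \ldots, U_{29}$. The three groups resulted in a total of $D = 29$ domains that were mutually exclusive and exhaustive. The number of units in each domain, $N_d$, was allocated in a monotonic manner: domain $U_1$ had 20 units; domain $U_2$ had 30 units; and so on up to domain $U_{29}$, which had 300 units.

In our simulation the auxiliary data $\mathbf{x}$ consisted of two auxiliary variables. The first one had the fixed value of one to represent the intercept in the model. The second one, $x$, represented the available auxiliary data in the population. The auxiliary variable $x$ in each group was generated from a Gamma(5, 10) distribution (shape 5, scale 10) with mean 50 and variance 500. The variable of interest $y$ was generated using the model

$$y_{dj} = \beta_{0,g} + \beta_{1,g} x_{dj} + v_d + e_{dj}, \quad g = 1, 2, 3, \qquad (2)$$

where $v_d \overset{iid}{\sim} N(0, \sigma_v^2)$ and $e_{dj} \overset{iid}{\sim} N(0, \sigma_e^2 c_{dj})$. We used $\sigma_v^2 = [\ldots]$ and $\sigma_e^2 = [\ldots]$, and set $c_{dj}$ equal to $x_{dj}$. The following table summarizes how the population was split into the three groups of domains.

Table 1. Groups, associated domains and regression parameters

| Group (g) | Domains in group | $\beta_{0,g}$ | $\beta_{1,g}$ |
|---|---|---|---|
| 1 | $U_d$ for $d = 1, \ldots, 9$ | […] | […] |
| 2 | $U_d$ for $d = 10, \ldots, 19$ | […] | […] |
| 3 | $U_d$ for $d = 20, \ldots, 29$ | […] | […] |

A plot of the generated population is shown in Figure 1. The units in the groups are shown respectively in green, blue and yellow. The three regression lines are shown in red. Without the colours to identify the groups, one might be inclined to think that the population was generated under a model with a single auxiliary variable (one intercept and slope), as shown in the inset.

[Figure 1. Plot of y vs. x for the population in the simulation study]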
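The population generation described above can be sketched as follows. This is an illustration only: the group intercepts and slopes in `HYPOTHETICAL_BETAS` and the variance components passed to the function are placeholder values chosen here, not those of the study, and the step of 10 units between consecutive domain sizes is inferred from $N_1 = 20$, $N_2 = 30$, $N_{29} = 300$ and $N = 4{,}640$.

```python
import math
import random


def generate_population(betas, sigma_v2, sigma_e2, seed=0):
    """Generate (domain, group, x, y) records following model (2)."""
    rng = random.Random(seed)
    sizes = [20 + 10 * k for k in range(29)]  # N_1 = 20, ..., N_29 = 300
    population = []
    for d, N_d in enumerate(sizes, start=1):
        group = 1 if d <= 9 else (2 if d <= 19 else 3)
        b0, b1 = betas[group]
        v_d = rng.gauss(0.0, math.sqrt(sigma_v2))  # area random effect
        for _ in range(N_d):
            x = rng.gammavariate(5.0, 10.0)  # shape 5, scale 10: mean 50
            # unit-level error with variance sigma_e^2 * c_dj, c_dj = x_dj
            e = rng.gauss(0.0, math.sqrt(sigma_e2 * x))
            population.append((d, group, x, b0 + b1 * x + v_d + e))
    return population


# Placeholder parameters (assumptions, not the values used in the paper)
HYPOTHETICAL_BETAS = {1: (10.0, 1.0), 2: (50.0, 2.0), 3: (100.0, 3.0)}
pop = generate_population(HYPOTHETICAL_BETAS, sigma_v2=4.0, sigma_e2=1.0)
```

Fixing the seed keeps the finite population constant across simulation runs, which matches the design-based view in which the population values are treated as fixed.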

We ran two separate simulation runs to reflect that two possible models could be fitted for the selected samples. We denote these simulations as runs 1 and 2. In the first run (simulation run 1), we assumed that the model could be fitted using $\mathbf{x}_j = (1, x_j)^T$ as auxiliary data; this is not correct, as the population was generated on the basis of three different regressions. In the second run (simulation run 2), we acknowledged that there were three separate models and used a set of auxiliary variables reflecting the manner in which the population values were generated; this fit is correct. This meant using a set of dummy-coded auxiliary variables defined as follows for each unit:

$$\mathbf{x}_j = \begin{cases} (1, 0, 0, x_j, 0, 0)^T & \text{if } j \in U_d \text{ in group 1} \\ (0, 1, 0, 0, x_j, 0)^T & \text{if } j \in U_d \text{ in group 2} \\ (0, 0, 1, 0, 0, x_j)^T & \text{if } j \in U_d \text{ in group 3.} \end{cases}$$

In the small area estimation model given by equation (1), the use of this $\mathbf{x}_j$ implies the following regression coefficient for the fixed effects:

$$\boldsymbol{\beta} = (\beta_1, \beta_2, \beta_3, \beta_4, \beta_5, \beta_6)^T. \qquad (3)$$

For the synthetic estimator and the calibration estimators, we set $c_j = x_j$ to reflect the heterogeneity of the model errors.

Each simulation run involved the selection of R = 100,000 independent samples and the computation of various estimates for each sample. Each sample was a simple random sample $s$ of size $n$ selected without replacement from $U$. We used sample sizes n = 232 (5%), n = 464 (10%), n = 696 (15%) and n = 928 (20%), where the sampling fractions are indicated in brackets. These are within the range of the sampling fractions typically used by business surveys.

The sample units in domain $U_d$ are denoted by $s_d$, with $s = \bigcup_{d=1}^{D} s_d$. We observe $n_d$ units in $U_d$, where $0 \le n_d \le N_d$ and $n = \sum_{d=1}^{D} n_d$. Under simple random sampling without replacement, the $n_d$ follow a multivariate hypergeometric distribution with probability mass function

$$\prod_{d=1}^{D} \binom{N_d}{n_d} \Big/ \binom{N}{n}.$$

The following table shows the probability of observing $n_d = 0$, $n_d = 1$ or $n_d = 2$ in the three smallest domains when the sample size $n$ is 232.
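Because the marginal distribution of each $n_d$ under this design is hypergeometric, the probabilities of very small domain sample sizes can be computed directly. The short Python sketch below does so for the smallest domain ($N_1 = 20$, $N = 4{,}640$, $n = 232$); the function name is an assumption made here for illustration.

```python
from math import comb


def prob_domain_sample_size(k, N_d, N, n):
    """P(n_d = k) under SRSWOR: the hypergeometric probability mass."""
    return comb(N_d, k) * comb(N - N_d, n - k) / comb(N, n)


# Smallest domain U_1 with N_1 = 20, population N = 4640, sample n = 232
p_zero = prob_domain_sample_size(0, 20, 4640, 232)
p_fewer_than_3 = sum(prob_domain_sample_size(k, 20, 4640, 232) for k in range(3))
```

Here `p_zero` is the chance that the domain receives no sample at all, and `p_fewer_than_3` is the chance that a domain-level calibration estimator with two auxiliary variables cannot be computed.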

Table 4. Probabilities in the 3 smallest domains when n = 232

| Probability | $U_1$ with $N_1 = 20$ | $U_2$ with $N_2 = 30$ | $U_3$ with $N_3 = 40$ |
|---|---|---|---|
| Prob($n_d = 0$) | […] | […] | […] |
| Prob($n_d = 1$) | […] | […] | […] |
| Prob($n_d = 2$) | […] | […] | […] |

Using Table 4, for the smallest domain $U_1$, $\hat{Y}_{d,HT}$ and $\hat{Y}_{d,CALU}$ would be equal to zero about 36% of the time. Note that this probability decreases rapidly as the domain population size $N_d$ increases. Since we require $n_d \ge 3$, we cannot produce an estimate for $\hat{Y}_{d,CALUd}$ in approximately 92.5% of the samples selected in the smallest domain $U_1$. This probability also decreases rapidly as the domain population size $N_d$ increases.

3.2. Simulation statistics

For each selected sample in each simulation run, $r = 1, \ldots, R$ (R = 100,000), we computed estimates of $Y_d$ for the seven estimators. Denote $\hat{Y}_{d,EST}^{(r)}$ as the estimate produced for the $r$-th sample, $r = 1, 2, \ldots, R$, where the subscript EST is a placeholder for any one of the seven estimators. For each domain $d = 1, \ldots, 29$, we computed the bias as

$$\text{Bias}(\hat{Y}_{d,EST}) = \frac{1}{R} \sum_{r=1}^{R} \hat{Y}_{d,EST}^{(r)} - Y_d$$

and the mean square error as

$$\text{MSE}(\hat{Y}_{d,EST}) = \frac{1}{R} \sum_{r=1}^{R} \left( \hat{Y}_{d,EST}^{(r)} - Y_d \right)^2.$$

For each estimator $\hat{Y}_{EST}$, we also computed the following summary statistics across all domains and simulated samples: the average absolute relative bias, the average coefficient of variation and the average relative efficiency, denoted as $\overline{ARB}(\hat{Y}_{EST})$, $\overline{CV}(\hat{Y}_{EST})$ and $\overline{RE}(\hat{Y}_{EST})$ respectively. These were computed as follows:

$$\overline{ARB}(\hat{Y}_{EST}) = \frac{1}{D} \sum_{d=1}^{D} ARB(\hat{Y}_{d,EST}), \quad \text{where } ARB(\hat{Y}_{d,EST}) = \frac{|\text{Bias}(\hat{Y}_{d,EST})|}{Y_d},$$

$$\overline{CV}(\hat{Y}_{EST}) = \frac{1}{D} \sum_{d=1}^{D} CV(\hat{Y}_{d,EST}), \quad \text{where } CV(\hat{Y}_{d,EST}) = \frac{\sqrt{\text{MSE}(\hat{Y}_{d,EST})}}{Y_d},$$

$$\overline{RE}(\hat{Y}_{EST}) = \frac{\overline{MSE}(\hat{Y}_{HT})}{\overline{MSE}(\hat{Y}_{EST})}, \quad \text{where } \overline{MSE}(\hat{Y}_{EST}) = \frac{1}{D} \sum_{d=1}^{D} \text{MSE}(\hat{Y}_{d,EST}). \qquad (4)$$

The statistic $\overline{RE}(\hat{Y}_{EST})$ measures the average efficiency of each estimator relative to the Horvitz-Thompson estimator. Since $\hat{Y}_{HT}$ is known to have the least efficiency among these seven estimators, this measure is a number larger than or equal to 1.

3.3. Simulation Results

Tables 5, 6 and 7 show the differences between the two runs using the summary statistics described in the previous section. The results for runs 1 and 2 are discussed after each of these three tables.

Table 5. Average Absolute Relative Bias $\overline{ARB}(\hat{Y}_{EST})$

Traditional domain estimators: $\hat{Y}_{HT}$, $\hat{Y}_{CALU}$, $\hat{Y}_{CALUd}$, $\hat{Y}_{REG}$. Small area estimators: $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$, $\hat{Y}_{PEBLUP}$.

| Sample size | Run | $\hat{Y}_{HT}$ | $\hat{Y}_{CALU}$ | $\hat{Y}_{CALUd}$ | $\hat{Y}_{REG}$ | $\hat{Y}_{SYN}$ | $\hat{Y}_{EBLUP}$ | $\hat{Y}_{PEBLUP}$ |
|---|---|---|---|---|---|---|---|---|
| […] | […] | […] | […] | […] | […] | […] | […] | […] |
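The bias, mean square error and the three summary statistics of section 3.2 can be computed from the simulated estimates along the following lines. This Python sketch is an assumption-laden illustration: the function names and the dictionary layout are introduced here, and the relative efficiency is taken as the ratio of the average MSE of the Horvitz-Thompson estimator to that of the estimator of interest, which is one reading of the definition.

```python
from math import sqrt


def bias_and_mse(estimates, true_total):
    """Monte Carlo bias and MSE of one estimator in one domain."""
    R = len(estimates)
    bias = sum(estimates) / R - true_total
    mse = sum((est - true_total) ** 2 for est in estimates) / R
    return bias, mse


def summary_statistics(est, est_ht, totals):
    """Average ARB, average CV and average RE across domains.

    est, est_ht: dicts mapping domain d to the list of R simulated estimates
                 for the estimator of interest and for Horvitz-Thompson
    totals:      dict mapping domain d to the true total Y_d
    """
    D = len(totals)
    arb = cv = mse_est = mse_ht = 0.0
    for d, Y_d in totals.items():
        bias, mse = bias_and_mse(est[d], Y_d)
        arb += abs(bias) / Y_d / D
        cv += sqrt(mse) / Y_d / D
        mse_est += mse / D
        mse_ht += bias_and_mse(est_ht[d], Y_d)[1] / D
    # Assumed reading of RE: ratio of average MSEs (HT over the estimator)
    return arb, cv, mse_ht / mse_est


# Hypothetical toy input: D = 2 domains and R = 2 replicates per domain
toy_est = {1: [9.0, 11.0], 2: [18.0, 22.0]}
toy_ht = {1: [8.0, 12.0], 2: [16.0, 24.0]}
toy_totals = {1: 10.0, 2: 20.0}
arb, cv, re = summary_statistics(toy_est, toy_ht, toy_totals)
```

In an actual run, replicates in which an estimator is set to missing (as for the domain-level calibration estimator when $n_d < 3$) would have to be excluded before calling these functions.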

Model does not fit (Run 1): The small area estimators $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$, and $\hat{Y}_{PEBLUP}$ have the largest ARBs. In particular, $\hat{Y}_{SYN}$ has the highest ARB. The ARB decreases as the sample size increases for $\hat{Y}_{EBLUP}$ and $\hat{Y}_{PEBLUP}$, whereas it remains constant (as expected) for $\hat{Y}_{SYN}$. The ARB associated with $\hat{Y}_{PEBLUP}$ decreases more rapidly than the one associated with $\hat{Y}_{EBLUP}$ as the sample size increases. The ARB associated with the traditional domain estimators is quite small; it also decreases as the sample size increases.

Model fits (Run 2): The ARBs associated with the small area estimators have significantly decreased. However, they are still higher than those associated with the traditional domain estimators. $\hat{Y}_{REG}$ has the smallest ARB amongst all the estimators. As noted in run 1, the ARB decreases as the sample size increases for all the estimators.

Table 6. Average Coefficient of Variation $\overline{CV}(\hat{Y}_{EST})$

Traditional domain estimators: $\hat{Y}_{HT}$, $\hat{Y}_{CALU}$, $\hat{Y}_{CALUd}$, $\hat{Y}_{REG}$. Small area estimators: $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$, $\hat{Y}_{PEBLUP}$.

| Sample size | Run | $\hat{Y}_{HT}$ | $\hat{Y}_{CALU}$ | $\hat{Y}_{CALUd}$ | $\hat{Y}_{REG}$ | $\hat{Y}_{SYN}$ | $\hat{Y}_{EBLUP}$ | $\hat{Y}_{PEBLUP}$ |
|---|---|---|---|---|---|---|---|---|
| […] | […] | […] | […] | […] | […] | […] | […] | […] |

Model does not fit (Run 1): Estimators $\hat{Y}_{HT}$ and $\hat{Y}_{CALU}$ have the highest CV among all estimators; their CVs are quite comparable, implying that the auxiliary data used at the population level in $\hat{Y}_{CALU}$ has no impact on improving the reliability of the estimator at the domain level. The synthetic estimator $\hat{Y}_{SYN}$ also has a high CV that remains constant no matter what the sample size is. The calibration estimator $\hat{Y}_{CALUd}$ has the lowest CV for all sample sizes. The ranking from low to high of the remaining three estimators is $\hat{Y}_{PEBLUP}$, $\hat{Y}_{EBLUP}$ and $\hat{Y}_{REG}$. Note that the CV of $\hat{Y}_{REG}$ decreases quite rapidly as compared to the other estimators. The reliability of all the estimators improves as the sample size increases.

Model fits (Run 2): The CVs are smaller than those obtained in run 1 for all estimators except for $\hat{Y}_{HT}$ and $\hat{Y}_{CALU}$. This is expected, as neither estimator profits from the auxiliary data. These two estimators are still the ones with the highest CVs. As expected, because the model fits well, all three small area estimators $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$ and $\hat{Y}_{PEBLUP}$ have reasonable CVs. The modified regression estimator $\hat{Y}_{REG}$ performs better than the calibration estimator at the domain level, $\hat{Y}_{CALUd}$; the reverse was true when the model was incorrect (run 1).

Table 7. Average Relative Efficiency $\overline{RE}(\hat{Y}_{EST})$

Traditional domain estimators: $\hat{Y}_{HT}$, $\hat{Y}_{CALU}$, $\hat{Y}_{CALUd}$, $\hat{Y}_{REG}$. Small area estimators: $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$, $\hat{Y}_{PEBLUP}$.

| Sample size | Run | $\hat{Y}_{HT}$ | $\hat{Y}_{CALU}$ | $\hat{Y}_{CALUd}$ | $\hat{Y}_{REG}$ | $\hat{Y}_{SYN}$ | $\hat{Y}_{EBLUP}$ | $\hat{Y}_{PEBLUP}$ |
|---|---|---|---|---|---|---|---|---|
| […] | […] | […] | […] | […] | […] | […] | […] | […] |

Note: The higher the number, the more efficient the estimator relative to the HT estimator. Recall that run 1 represents the results when the model does not fit, whereas run 2 represents the results when the model fits.

Model does not fit (Run 1): The ranking of the estimators (from highest $\overline{RE}$ to lowest $\overline{RE}$) is as follows: $\hat{Y}_{CALUd}$, $\hat{Y}_{PEBLUP}$, $\hat{Y}_{EBLUP}$, $\hat{Y}_{REG}$, $\hat{Y}_{SYN}$, and $\hat{Y}_{CALU}$. The traditional domain estimator $\hat{Y}_{CALUd}$ is doing the best, but it is closely followed by the two small area estimators $\hat{Y}_{PEBLUP}$ and $\hat{Y}_{EBLUP}$. As the sample size increases, there is a dichotomy in terms of $\overline{RE}$. The relative efficiency increases for $\hat{Y}_{CALUd}$ and $\hat{Y}_{PEBLUP}$, whereas it decreases for $\hat{Y}_{REG}$, $\hat{Y}_{SYN}$, and $\hat{Y}_{EBLUP}$. There is no change to $\hat{Y}_{CALU}$, as the auxiliary information is not useful at the domain level.

Model fits (Run 2): The ranking of the estimators (from highest $\overline{RE}$ to lowest $\overline{RE}$) has changed with respect to run 1. It is now $\hat{Y}_{EBLUP}$, $\hat{Y}_{PEBLUP}$, $\hat{Y}_{SYN}$, $\hat{Y}_{REG}$, $\hat{Y}_{CALUd}$, and $\hat{Y}_{CALU}$. The small area estimators are clearly more efficient than the traditional estimators. The relative efficiency increased for all estimators: the maximum is now 14 versus 8 obtained in run 1. Once more, as the sample size increases, there is a dichotomy in terms of $\overline{RE}$.

Another way to summarize the behaviour of the various estimators is graphically. We summarize the absolute relative bias, $ARB(\hat{Y}_{d,EST})$, and the coefficient of variation, $CV(\hat{Y}_{d,EST})$, within each domain $d = 1, 2, \ldots, D$, where D = 29.

Figures 2a and 2b display two typical graphs of the absolute relative bias over the domains for the two simulation runs. These graphs show the results for the sample size of 464. Similar results were obtained for the other sample sizes. We can see that the absolute relative bias of $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$ and $\hat{Y}_{PEBLUP}$ is greatly reduced when we specify the correct auxiliary variables in the underlying model. In the first run, the small area estimators show a drop and a rise between the groups of domains. This can be explained as follows. The overall model fitted using $\mathbf{x}_j = (1, x_j)^T$ produces a regression which is close to the underlying model for the second group of domains. Therefore, the differences are small for the second group of domains. However, this overall model is quite different from the one used to generate the population in the first and third groups of domains.

[Figure 2a. Plots of the absolute relative bias of the estimators for sample size 464. Panel: Model does not fit. Vertical axis: ARB. Legend: $\hat{Y}_{HT}$, $\hat{Y}_{REG}$, $\hat{Y}_{CALUd}$, $\hat{Y}_{CALU}$, $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$, $\hat{Y}_{PEBLUP}$]

[Figure 2b. Plots of the absolute relative bias of the estimators for sample size 464. Panel: Model fits. Vertical axis: ARB. Legend as in Figure 2a]

Figures 3a and 3b display the coefficient of variation associated with the estimators. The coefficient of variation is reduced for all estimators except the HT estimator $\hat{Y}_{HT}$ (which does not use any auxiliary information) and $\hat{Y}_{CALUd}$ (because the auxiliary variables for this estimator are equivalent in the two runs).

[Figure 3a. Plots of the coefficient of variation of the estimators for sample size 464. Panel: Model does not fit. Vertical axis: CV. Legend: $\hat{Y}_{HT}$, $\hat{Y}_{REG}$, $\hat{Y}_{CALUd}$, $\hat{Y}_{CALU}$, $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$, $\hat{Y}_{PEBLUP}$]

[Figure 3b. Plots of the coefficient of variation of the estimators for sample size 464. Panel: Model fits. Vertical axis: CV. Legend as in Figure 3a]

Figures 4a and 4b show a graphical display of the average coefficient of variation $\overline{CV}(\hat{Y}_{EST})$ results given in Table 6. Under run 2, we see that $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$ and $\hat{Y}_{PEBLUP}$ have the smallest $\overline{CV}(\hat{Y}_{EST})$. All three lines are indistinguishable as they are very close together.

[Figure 4a. Plots of the average coefficient of variation of the estimators by sample size. Panel: Model does not fit. Vertical axis: CV]

[Figure 4b. Plots of the average coefficient of variation of the estimators by sample size. Panel: Model fits. Vertical axis: CV]

Figures 5a and 5b show a graphical display of the average relative efficiency of the estimators given in Table 7. Under run 2, we note that $\hat{Y}_{EBLUP}$ and $\hat{Y}_{PEBLUP}$ have the highest $\overline{RE}(\hat{Y}_{EST})$ over the various sample sizes.

[Figure 5a. Plots of the average relative efficiency of the estimators by sample size. Panel: Model does not fit. Vertical axis: RE]

[Figure 5b. Plots of the average relative efficiency of the estimators by sample size. Panel: Model fits. Vertical axis: RE]

4. Conclusions

We compared via simulation the behaviour of a number of traditional domain and small area estimators. The sampling design used in the simulation, simple random sampling without replacement, is a simplification of the sampling design commonly used for business surveys (stratified simple random sampling without replacement). The estimators using the auxiliary data either reflected the model used to generate the population (model fits) or did not (model does not fit).

The simulation design did not use unequal probability sampling. The additional complexity of using unequal probability sampling is that we would have had to modify our model-based small area estimators to account for possible informative sampling. However, since we used simple random sampling without replacement, we did not have to account for this problem.

The conclusions of our simulation are as follows. Comparing the efficiency between the traditional and small area estimators, the results very much depend on whether the model holds or not. The calibration estimator $\hat{Y}_{CALU}$, which only uses auxiliary data at the population level, is not efficient at the domain level whether the model holds or not. This is in contrast to $\hat{Y}_{CALUd}$, which uses auxiliary data at the domain level. The estimator $\hat{Y}_{CALUd}$ is the best traditional estimator to use when the model holds. Its average relative efficiency increases as the overall sample size increases. Its weakness is in the smaller domains, where the expected sample size is smaller than three units, as it cannot be defined when the auxiliary data consists of two auxiliary variables; in general, when there are $p$ auxiliary variables, we are not able to define $\hat{Y}_{CALUd}$ when the domain sample size is smaller than $p + 1$. When the model does not hold, $\hat{Y}_{REG}$ is the best traditional estimator to use. However, it is outperformed by the small area estimators $\hat{Y}_{SYN}$, $\hat{Y}_{EBLUP}$ and $\hat{Y}_{PEBLUP}$.

The small area estimator $\hat{Y}_{EBLUP}$ is the most efficient one when the model holds, although it is closely followed by $\hat{Y}_{SYN}$ and $\hat{Y}_{PEBLUP}$. When the model does not hold, the $\hat{Y}_{PEBLUP}$ estimator is the most efficient small area estimator; an explanation for this is that it is design-consistent.

Acknowledgement

We would like to thank the referee for his constructive comments that substantially improved this paper.

REFERENCES

BATTESE, G. E., HARTER, R. M., FULLER, W. A., (1988). An error-component model for prediction of county crop areas using survey and satellite data. Journal of the American Statistical Association, 83 (401).

DATTA, G. S., (2009). Model-based approach to small area estimation. Handbook of Statistics, 29.

DEVILLE, J. C., SÄRNDAL, C. E., (1992). Calibration estimation in survey sampling. Journal of the American Statistical Association, 87 (418).

ESTEVAO, V., HIDIROGLOU, M. A., YOU, Y., (2014). Methodology Software Library - Small Area Estimation Methodology Specifications for Area and Unit Level Based Models. Technical Report, Statistics Canada.

FAY, R. E., HERRIOT, R. A., (1979). Estimation of income for small places: an application of James-Stein procedures to census data. Journal of the American Statistical Association, 74 (366A).

LEHTONEN, R., VEIJANEN, A., (2009). Design-based methods of estimation for domains and small areas. Handbook of Statistics, 29.

PFEFFERMANN, D., (2013). New important developments in small area estimation. Statistical Science, 28 (1).

PFEFFERMANN, D., SVERCHKOV, M., (2007). Small Area Estimation Under Informative Probability Sampling of Areas and Within the Selected Areas. Journal of the American Statistical Association, 102 (480).

RAO, J. N. K., (2003). Small Area Estimation. John Wiley & Sons.

SINGH, A. C., MIAN, I. U. H., (1995). Generalized Sample Size Dependent Estimators for Small Areas. Proceedings of the 1995 Annual Research Conference, U.S. Bureau of the Census, Washington, DC.

SINGH, M. P., GAMBINO, J., MANTEL, H., (1994). Issues and Strategies for Small Area Data. Survey Methodology, 20 (1), 3–22.

WOODRUFF, R. S., (1966). Use of a Regression Technique to Produce Area Breakdowns of the Monthly National Estimates of Retail Trade. Journal of the American Statistical Association, 61 (314).

YOU, Y., RAO, J. N. K., (2002). A pseudo-empirical best linear unbiased prediction approach to small area estimation using survey weights. Canadian Journal of Statistics, 30.

VERRET, F., RAO, J. N. K., HIDIROGLOU, M. A., (2015). Model-based small area estimation under informative sampling. To appear in the December 2015 issue of Survey Methodology.


More information

Lower Bounds for the Smoothed Number of Pareto optimal Solutions

Lower Bounds for the Smoothed Number of Pareto optimal Solutions Lower Bouns for the Smoothe Number of Pareto optimal Solutions Tobias Brunsch an Heiko Röglin Department of Computer Science, University of Bonn, Germany brunsch@cs.uni-bonn.e, heiko@roeglin.org Abstract.

More information

The influence of the equivalent hydraulic diameter on the pressure drop prediction of annular test section

The influence of the equivalent hydraulic diameter on the pressure drop prediction of annular test section IOP Conference Series: Materials Science an Engineering PAPER OPEN ACCESS The influence of the equivalent hyraulic iameter on the pressure rop preiction of annular test section To cite this article: A

More information

Cascaded redundancy reduction

Cascaded redundancy reduction Network: Comput. Neural Syst. 9 (1998) 73 84. Printe in the UK PII: S0954-898X(98)88342-5 Cascae reunancy reuction Virginia R e Sa an Geoffrey E Hinton Department of Computer Science, University of Toronto,

More information

Linear Regression with Limited Observation

Linear Regression with Limited Observation Ela Hazan Tomer Koren Technion Israel Institute of Technology, Technion City 32000, Haifa, Israel ehazan@ie.technion.ac.il tomerk@cs.technion.ac.il Abstract We consier the most common variants of linear

More information

Contextual Effects in Modeling for Small Domains

Contextual Effects in Modeling for Small Domains University of Wollongong Research Online Applied Statistics Education and Research Collaboration (ASEARC) - Conference Papers Faculty of Engineering and Information Sciences 2011 Contextual Effects in

More information

Hybrid Fusion for Biometrics: Combining Score-level and Decision-level Fusion

Hybrid Fusion for Biometrics: Combining Score-level and Decision-level Fusion Hybri Fusion for Biometrics: Combining Score-level an Decision-level Fusion Qian Tao Raymon Velhuis Signals an Systems Group, University of Twente Postbus 217, 7500AE Enschee, the Netherlans {q.tao,r.n.j.velhuis}@ewi.utwente.nl

More information

Multi-View Clustering via Canonical Correlation Analysis

Multi-View Clustering via Canonical Correlation Analysis Kamalika Chauhuri ITA, UC San Diego, 9500 Gilman Drive, La Jolla, CA Sham M. Kakae Karen Livescu Karthik Sriharan Toyota Technological Institute at Chicago, 6045 S. Kenwoo Ave., Chicago, IL kamalika@soe.ucs.eu

More information