Estimation of Some Proportion in a Clustered Population


Nonlinear Analysis: Modelling and Control, 2009, Vol. 14, No. 4

Estimation of Some Proportion in a Clustered Population

D. Krapavickaitė

Institute of Mathematics and Informatics
Akademijos str. 4, Vilnius, Lithuania
krapav@ktl.mii.lt

Abstract. The aim of this paper is to show that the traditional design-based estimator of the proportion of population units associated with at least one subunit having an attribute of interest, under a two-stage sampling design, is biased. Such a situation is faced in the Adult Education Survey of official statistics of the European countries when estimating the share of individuals in non-formal education who are involved in job-related learning activities. Alternative design- and model-based estimators are proposed.

Keywords: finite population, proportion, two-stage sampling design, bias.

1 Introduction

A new problem related to the estimation of a proportion has arisen in the Adult Education Survey of the European countries ([1-3], hereinafter referred to as AES). The parameter of interest is the share of individuals in non-formal education involved in job-related learning activities. According to the sampling design, the individuals included in the first-stage AES sample present a second-stage simple random sample of size $m \le 3$ of the non-formal education learning activities in which they have been involved during the year. Some of these activities are job-related, but some of them are not. Even if no job-related learning activities occur in the sample, they can occur among the non-sampled ones and have to be taken into account.

The problem has arisen in practical work, and the author has not met a similar problem solved, or at least touched upon, in the literature. In this paper the problem is described in a general framework, it is shown by an example that the design-based estimator of this parameter is biased, and the size of the bias is illustrated. In order to take into account possible non-sampled job-related learning activities of sampled individuals, an assumption on the distribution of the number of such activities for each individual is made. Alternative design- and model-based estimators are proposed; their application is shown in Examples 5 and 6.

2 Population and parameters

Let us denote by $U_1 = \{u_1, u_2, \dots, u_N\}$ (or $U_1 = \{1, 2, \dots, N\}$, without restriction of generality) the population of units, with each of which a cluster of subunits $U_{2i}$ of size $M_i$, $i = 1, 2, \dots, N$, is associated. Thus, the population of all subunits $U_2$ consists of $M = M_1 + \dots + M_N$ elements:

    U_2 = \bigcup_{i=1}^{N} U_{2i}.

Suppose that some of the subunits have an attribute of interest and some of them do not. Let us introduce an attribute indicator $z$, a study variable in the population $U_1$, with the value $z_i = 1$ if there is at least one subunit with the attribute among the $M_i$ subunits associated with the unit $u_i$, and $z_i = 0$ otherwise, $i = 1, 2, \dots, N$. Then the number of units in the population associated with at least one subunit having the attribute is equal to the total of the variable $z$:

    t_z = \sum_{i=1}^{N} z_i.    (1)

The share (proportion) of the units in $U_1$ having at least one associated subunit with the attribute is equal to the mean of the variable $z$: $\mu_z = t_z / N$. Let us consider the estimation of the parameters $t_z$ and $\mu_z$ from the survey data.

3 Sample and the usual estimator

The sampling design for the subunits that constitute the population $U_2$ is a two-stage sampling design: at the first stage, some probabilistic sample $s_I$ of $n$ units is drawn from $U_1$; at the second stage, a simple random sample $s_{IIi}$ of $m_i$ subunits is drawn from the cluster associated with the unit $u_i$ (or all of them are taken, if their number is smaller than $m_i$):

    s = \bigcup_{i \in s_I} s_{IIi} \subset U_2,    s_{IIi} \subset U_{2i}.

At the second stage the size $m_i$ of the sample $s_{IIi}$ can be any positive number, but for simplicity, and without loss of generality, let us consider

    m_i = M_i, if M_i = 0, 1, 2;    m_i = 3, if M_i \ge 3,    for i \in s_I.

This is the case in the Lithuanian AES. Denote by $d_i = 1/\pi_i$ the first-stage sampling design weight, with the first- and second-order inclusion probabilities

    \pi_i = P(s_I: i \in s_I) > 0,    \pi_{ii} = \pi_i,    \pi_{ij} = P(s_I: i \in s_I \text{ and } j \in s_I) > 0,    i, j \in U_1,  i \ne j.
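To make this setup concrete, here is a minimal Python sketch (not code from the paper; the population size, cluster sizes and attribute probability are invented) that builds a small clustered population, computes the parameters $t_z$ and $\mu_z$ of Section 2, and draws a two-stage sample with $m_i = \min(M_i, 3)$ as above:

```python
# A minimal sketch (not from the paper) of the two-stage structure described above:
# a population U_1 of N units, unit i carrying a cluster of M_i subunits, some of
# which bear the attribute of interest; z_i indicates whether unit i has at least
# one such subunit.  All names and numbers here are illustrative assumptions.
import random

random.seed(1)

N = 8
# attribute indicators of the subunits in each cluster (True = has the attribute)
clusters = [[random.random() < 0.3 for _ in range(random.randint(0, 5))] for _ in range(N)]

M = [len(c) for c in clusters]               # cluster sizes M_i
z = [int(any(c)) for c in clusters]          # study variable z_i
t_z = sum(z)                                 # population total (1)
mu_z = t_z / N                               # population proportion

# two-stage sample: first-stage SRS of n units (so pi_i = n/N, d_i = N/n),
# second-stage SRS of m_i = min(M_i, 3) subunits within each sampled cluster
n = 4
s_I = random.sample(range(N), n)
d = {i: N / n for i in s_I}
s_II = {i: random.sample(clusters[i], min(M[i], 3)) for i in s_I}

print(t_z, mu_z, s_II)
```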

The Horvitz-Thompson estimator of the population total $t_z$ of the variable $z$,

    \sum_{i \in s_I} \frac{z_i}{\pi_i},

cannot be used to estimate the number of units associated with at least one subunit with the attribute in the population $U_1$, because for $m_i < M_i$ the values $z_i$ may not be observable. The often used design-based estimator of the number of units associated with at least one subunit having the attribute is

    \hat t_z = \sum_{i \in s_I} d_i \hat z_i,    (2)

where $\hat z_i$ is the design-based estimator of $z_i$:

    \hat z_i = 1, if at least one subunit with the attribute belongs to s_{IIi};    \hat z_i = 0, otherwise.    (3)

For the share of units associated with at least one subunit having the attribute, the suggested design-based estimator is

    \hat\mu_z = \hat t_z / N.    (4)

This estimator is usually used in the pilot AES of the statistical offices of the European Community.

Hypothesis: estimators (2) and (4) are biased, i.e., $E\hat t_z \ne t_z$, $E\hat\mu_z \ne \mu_z$; the expectation is taken here with respect to the two-stage sampling design. The following examples confirm the hypothesis.
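Before turning to the bias, a standalone sketch of estimators (2)-(4) may help fix ideas; the design weights, sampled subunits and population size below are invented for illustration:

```python
# A standalone sketch of the usual design-based estimator (2)-(4).  The sample
# data below are invented; in the AES, d_i are the first-stage design weights
# and each list holds the attribute indicators of the <= 3 sampled learning
# activities of individual i (1 = job-related).
d = {1: 50.0, 2: 50.0, 3: 75.0}                           # design weights d_i = 1/pi_i
sampled_subunits = {1: [0, 0, 1], 2: [0], 3: [0, 0, 0]}   # second-stage samples s_IIi

z_hat = {i: int(any(sampled_subunits[i])) for i in d}     # estimator (3)
t_z_hat = sum(d[i] * z_hat[i] for i in d)                 # estimator (2)
N = 300                                                   # population size (assumed)
mu_z_hat = t_z_hat / N                                    # estimator (4)
print(z_hat, t_z_hat, mu_z_hat)
```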

4 Bias

We show by Example 1 that, using the design-based approach to this problem, we unavoidably obtain a biased estimator of the parameter.

Example 1 (Existence of the bias of the estimator $\hat t_z$). Let us study a Small Population $U_1 = \{u_1, u_2, u_3\}$ consisting of $N = 3$ units. The unit $u_1$ is associated with one subunit without the attribute, denoted nonattr; the unit $u_2$ is associated with one subunit with the attribute, denoted attrib; the unit $u_3$ is associated with two subunits: one with the attribute (attrib) and one without it (nonattr). For this population, the number of units with the attribute and their share are

    t_z = z_1 + z_2 + z_3 = 0 + 1 + 1 = 2,    \mu_z = 2/3.

Let us draw the first-stage simple random sample $s_I$ of $n = 2$ elements from the population $U_1$. The possible realizations of the sample according to this sampling design and their sampling probabilities are

    s_{I1} = (u_1, u_2),  s_{I2} = (u_1, u_3),  s_{I3} = (u_2, u_3),    P(s_{I1}) = P(s_{I2}) = P(s_{I3}) = 1/3.

Let us simplify the sample design, taking for the sample of subunits

    m_i = M_i, for M_i = 0;    m_i = 1, for M_i \ge 1.

The second-stage sampling design probabilities are as follows:

    P(nonattr | u_1) = 1,  P(attrib | u_1) = 0,  P(nonattr | u_2) = 0,  P(attrib | u_2) = 1,  P(nonattr | u_3) = P(attrib | u_3) = 1/2.

Let us estimate $t_z$ using estimator (2) and the data of these samples:

    s_{I1} = (u_1, u_2):    \hat t_z^{(1)} = \frac{N}{n}(z_1 + z_2) = \frac{3}{2}(0 + 1) = \frac{3}{2},    \hat\mu_z^{(1)} = \frac{1}{2}.

For the element $u_3$, we estimate $\hat z_3 = 1$ if the subunit with the attribute is selected for the second-stage sample, and $\hat z_3 = 0$ otherwise.

    s_{I2} = (u_1, u_3):  if s_{II3} = \{nonattr\}, then \hat t_z^{(2)} = \frac{N}{n}(z_1 + \hat z_3) = \frac{3}{2}(0 + 0) = 0,    \hat\mu_z^{(2)} = 0;
                          if s_{II3} = \{attrib\},  then \hat t_z^{(3)} = \frac{N}{n}(z_1 + \hat z_3) = \frac{3}{2}(0 + 1) = \frac{3}{2},    \hat\mu_z^{(3)} = \frac{1}{2}.

    s_{I3} = (u_2, u_3):  if s_{II3} = \{nonattr\}, then \hat t_z^{(4)} = \frac{N}{n}(z_2 + \hat z_3) = \frac{3}{2}(1 + 0) = \frac{3}{2},    \hat\mu_z^{(4)} = \frac{1}{2};
                          if s_{II3} = \{attrib\},  then \hat t_z^{(5)} = \frac{N}{n}(z_2 + \hat z_3) = \frac{3}{2}(1 + 1) = 3,    \hat\mu_z^{(5)} = 1.

Let us calculate the expectation of $\hat t_z$ with respect to the sampling design:

    E\hat t_z = \hat t_z^{(1)} P(s_{I1}) + \big(\hat t_z^{(2)} P(nonattr | u_3) + \hat t_z^{(3)} P(attrib | u_3)\big) P(s_{I2}) + \big(\hat t_z^{(4)} P(nonattr | u_3) + \hat t_z^{(5)} P(attrib | u_3)\big) P(s_{I3})
              = \frac{1}{3}\Big(\frac{3}{2} + \Big(0 \cdot \frac{1}{2} + \frac{3}{2} \cdot \frac{1}{2}\Big) + \Big(\frac{3}{2} \cdot \frac{1}{2} + 3 \cdot \frac{1}{2}\Big)\Big) = \frac{3}{2} \ne t_z = 2.

It means that the estimator $\hat t_z$ is biased. Consequently,

    E\hat\mu_z = \frac{E\hat t_z}{N} = \frac{1}{2} \ne \mu_z = \frac{2}{3},

and the estimator $\hat\mu_z$ of the proportion of the units with the attribute is also biased.
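The expectation just computed can be verified by exhaustive enumeration of the design. The following sketch (assuming only the Small Population data of Example 1) reproduces $E\hat t_z = 3/2$:

```python
# Exact enumeration of Example 1 (a sketch, not code from the paper): all
# first-stage samples of n = 2 units and all second-stage outcomes for u_3,
# weighted by their design probabilities, reproduce E t_z_hat = 3/2 < t_z = 2.
from fractions import Fraction as F

N, n = 3, 2
clusters = {1: [0], 2: [1], 3: [0, 1]}          # 1 = subunit with the attribute
first_stage = [((1, 2), F(1, 3)), ((1, 3), F(1, 3)), ((2, 3), F(1, 3))]

def second_stage_outcomes(i):
    """One subunit is drawn at random from cluster i (m_i = 1 here)."""
    c = clusters[i]
    return [(x, F(1, len(c))) for x in c]

expectation = F(0)
for units, p_I in first_stage:
    # enumerate the joint second-stage outcomes of the two sampled units
    outcomes = [second_stage_outcomes(i) for i in units]
    for (x1, q1) in outcomes[0]:
        for (x2, q2) in outcomes[1]:
            t_hat = F(N, n) * (int(x1 > 0) + int(x2 > 0))   # estimator (2)
            expectation += p_I * q1 * q2 * t_hat

print(expectation)        # 3/2, while the true total t_z = 2
```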

It is clear by intuition that estimator (2) underestimates the true number of the units with the attribute, because cases are possible where a sampled unit is classified, on the basis of its sampled subunits, as being without the attribute (the cases $s_{I2}$, $s_{I3}$ with $s_{II3} = \{nonattr\}$), while in reality a non-sampled subunit with the attribute is associated with it. On the other hand, no case is possible where a sampled unit is classified as being associated with a subunit with the attribute while in reality it is not so. The situation is visualized in Fig. 1: big circles denote units, small circles denote subunits, and small black circles denote subunits with the attribute; a subunit joined to its unit denotes a sampled subunit; "+" means $\hat z_i = 1$, "-" means $\hat z_i = 0$, and "?" means $\hat z_i = 0$ and a source of bias. Fig. 1 shows that estimator (2) underestimates the number of units with the attribute.

Fig. 1. Sample of subunits.

Example 2 (Size of the bias). Let us write an expression for the bias in a special case. Denote by $X_i$ the number of subunits with the attribute associated with the $i$th unit, and by $Y_i$ the number of sampled subunits with the attribute. Consider $M_i = M$, $m_i = m$, $X_i = k > 0$ fixed numbers for all $i = 1, \dots, N$. It means that each population unit is associated with exactly $k$ subunits having the attribute. The numbers of sampled subunits with the attribute $Y_i$ are independent identically distributed random variables; suppose they have the same distribution as a random variable $Y$. For a self-weighting first-stage sampling design, the population size can be expressed in the following way:

    N = t_z = N P(Y = 0) + N P(Y > 0).

For the estimator $\hat t_z$ in our example we have

    E\hat t_z = E \sum_{i \in s_I} d_i \hat z_i = N P(Y > 0).

From the last two expressions we obtain

    Bias(\hat t_z) = E\hat t_z - t_z = -N P(Y = 0).

For our sampling design we calculate

    P(Y = 0) = P(Y = 0 \mid X = k) = \frac{C_k^0 C_{M-k}^m}{C_M^m} = \Big(1 - \frac{k}{M}\Big)\Big(1 - \frac{k}{M-1}\Big) \cdots \Big(1 - \frac{k}{M-m+1}\Big).

Then

    Bias(\hat t_z) = -N \Big(1 - \frac{k}{M}\Big)\Big(1 - \frac{k}{M-1}\Big) \cdots \Big(1 - \frac{k}{M-m+1}\Big).

Some numerical values of the bias for parameters close to the Lithuanian AES ones are given in Table 1.

Table 1. Values of $Bias(\hat t_z)$ and of the relative bias $Bias(\hat t_z)/N \cdot 100$ % for the case $M = 10$, $m = 3$, $k = 1, 2, \dots, M$, with $N$ as in the Lithuanian AES [numerical entries not recoverable].

We see that the higher the number $k$ of subunits with the attribute in the population, the lower the bias of estimator (2) for the number of units associated with subunits with the attribute. The bias is unavoidable whenever $M > m = 3$. In order to adjust the estimator for this bias, we introduce a superpopulation model for the distribution of the number of subunits with the attribute.
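A short numerical sketch of the bias formula (the value of $N$ and the entries of Table 1 are not reproduced here; only the relative bias $-P(Y = 0 \mid X = k) \cdot 100$ % is shown) illustrates how the bias shrinks as $k$ grows, for $M = 10$ and $m = 3$:

```python
# A numerical sketch of the bias formula above for M = 10, m = 3.  The relative
# bias Bias(t_z_hat)/N equals -P(Y = 0 | X = k) and shrinks as k, the number of
# subunits with the attribute per unit, grows.
M, m = 10, 3

def p_y0(k, M=M, m=m):
    """P(Y = 0 | X = k) for simple random sampling of m out of M subunits."""
    if k > M - m:
        return 0.0
    prod = 1.0
    for j in range(m):
        prod *= 1 - k / (M - j)
    return prod

for k in range(1, M + 1):
    print(k, f"relative bias = {-100 * p_y0(k):.1f}%")
```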

5 Alternative estimators

We propose some design- and model-based estimators of the proportion of the first-stage sampling elements associated with subunits having the attribute under the two-stage sampling design. Some auxiliary assumptions on the superpopulation of subunits have to be stated.

1. Suppose that the number $M_i$ of subunits associated with the unit $u_i$ is fixed and known, but the number $X_i$ of subunits with the attribute is random, $0 \le X_i \le M_i$, $i = 1, \dots, N$. Let us define the probabilities

    p_{M_i}(k) = P(X_i = k \mid M_i),    \sum_{k=0}^{M_i} p_{M_i}(k) = 1,    i = 1, 2, \dots, N.

We consider these probabilities (the distribution of the variable $X_i$) to be known.

2. The values of the study variable $z$ become random because they depend on the values of the random variables $X_i$, $i = 1, \dots, N$. The population total $t_z$ is also random.

3. The number of sampled subunits with the attribute, $Y_i$, is random, $0 \le Y_i \le \min(3, X_i)$. The values $Y_i$, $Y_j$ are independent for $i \ne j$.

For estimator (3) of the value $z_i$ of the attribute indicator (study variable), the following relationship is valid:

    \hat z_i = 1, if Y_i > 0  \Rightarrow  X_i > 0,  z_i = 1;
    \hat z_i = 0, if Y_i = 0  \Leftarrow  X_i = 0,  z_i = 0.    (5)

The conditional distribution of $Y_i$, under the condition that the value of $X_i$ is known, is also known due to the known simple random sampling design of the subunits. Taking (5) into account, we obtain an expression for the probability, denoted by $p_i$, that the variable $z_i$ takes the value 1:

    p_i = P(z_i = 1) = P(X_i > 0),

or, equivalently,

    p_i = P(Y_i > 0) + P(X_i > 0 \mid Y_i = 0) P(Y_i = 0).    (6)

In order to obtain the new estimator of the total $t_z$, the value $\hat z_i$ in (2) is replaced by $p_i$ from (6). The expectation of $t_z$ with respect to the distribution of $X_i$, $i = 1, \dots, N$, is

    E_X t_z = E_X \sum_{i=1}^{N} z_i = \sum_{i=1}^{N} P(X_i > 0) = \sum_{i=1}^{N} p_i.

We are going to estimate this expectation.

Estimator A. Let us estimate $t_z$ by

    \hat t_z^{(A)} = \sum_{i \in s_I} d_i p_i.    (7)

This is a Horvitz-Thompson type estimator of the total of a study variable with the values $p_i$, $i = 1, 2, \dots, N$, and we use further the well-known result [4, p. 43] for this estimator.
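A minimal sketch of estimator A, under the assumption that the superpopulation probabilities $p_{M_i}(k)$ are known (the weights and distributions below are invented):

```python
# A sketch of estimator A (7): with the superpopulation probabilities
# p_{M_i}(k) assumed known, each sampled unit contributes d_i * p_i with
# p_i = P(X_i > 0) = 1 - p_{M_i}(0), regardless of the second-stage outcome.
# All numbers below are illustrative.
d = {1: 50.0, 2: 75.0}                          # first-stage design weights
p_M = {                                         # p_{M_i}(k), k = 0..M_i
    1: {0: 0.4, 1: 0.3, 2: 0.2, 3: 0.1},        # unit 1: M_1 = 3
    2: {0: 0.7, 1: 0.3},                        # unit 2: M_2 = 1
}

p = {i: 1 - p_M[i][0] for i in d}               # p_i = P(X_i > 0)
t_hat_A = sum(d[i] * p[i] for i in d)           # estimator (7)
print(p, t_hat_A)
```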

8 D. Krapavicaitė (ii its variance V ar (A (ˆt = N 1 π i π i p 2 i + i,j=1 i j (iii the estimator of variance (A 1 π i V ar (ˆt = π 2 p 2 i + i s I i i,j s I i j (A is unbiased for V ar (ˆt. (π ij π i π j p ip j π i π j, π ij π i π j π ij p i π i p j π j Estimator B. Let us introduce a design and model-based estimator of a value i { 1, if Y i > 0, ẑ i = P(X i > 0 Y i = 0, if Y i = 0. and define a new estimator of the total t : ˆt (B = i s I d i ẑ i. (8 The probability P(X i > 0 Y i = 0 in the case Y i = 0 is included here in the definition of ẑ i in comparison to ẑ i in (3. Proposition B. Assume the distribution of random variables X i, i = 1, 2,..., N, to be nown, and M i, M i > 0 to be fixed and nown. Then (i the estimator ˆt (B, given in (8, is unbiased for E X t under the sampling design described in Section 3 and model: Eˆt (B = (ii its variance V ar (B (ˆt p i, (9 = V ar (ˆt (A N ( + d i pi p 2 i P(X i > 0 Y i = 0p Mi (0, (iii the suggested estimator of variance is (B V ar (ˆt = V ar (ˆt (A + i s I d 2 i ( pi p 2 i P(X i > 0 Y i = 0p Mi (0. The first term in the expression of V ar(ˆt (B is due to the sampling design, and second term is due to the distribution of X i, i = 1, 2,...,N. 480

Remark 1. If $X_i$ is non-random, then

    p_i = 1, if X_i > 0;    p_i = 0, if X_i = 0,

and $E\hat t_z^{(A)} = E\hat t_z^{(B)} = t_z$ is the number of population units associated with at least one subunit with the attribute.

Remark 2. Calculation of the probability $P(X_i > 0 \mid Y_i = 0)$, used for $Var(\hat t_z^{(B)})$, in the case $m = 3$:

    P(X_i > 0 \mid Y_i = 0) P(Y_i = 0) = \sum_{k=1}^{M_i - 3} P(X_i = k \mid Y_i = 0) P(Y_i = 0)
        = \sum_{k=1}^{M_i - 3} P(Y_i = 0 \mid X_i = k) P(X_i = k)
        = \sum_{k=1}^{M_i - 3} \Big(1 - \frac{k}{M_i}\Big)\Big(1 - \frac{k}{M_i - 1}\Big)\Big(1 - \frac{k}{M_i - 2}\Big) p_{M_i}(k).    (10)

Hence,

    P(X_i > 0 \mid Y_i = 0) = \frac{1}{P(Y_i = 0)} \sum_{k=1}^{M_i - 3} \Big(1 - \frac{k}{M_i}\Big)\Big(1 - \frac{k}{M_i - 1}\Big)\Big(1 - \frac{k}{M_i - 2}\Big) p_{M_i}(k).

Estimator C. In practice, the distribution of $X_i$ is not known and has to be estimated. Suppose that estimators $\hat p_{M_i}(k)$ are used for the probabilities $p_{M_i}(k)$. Then we define

    \hat P(X_i > 0 \mid Y_i = 0) = \frac{1}{P(Y_i = 0)} \sum_{k=1}^{M_i - 3} \Big(1 - \frac{k}{M_i}\Big)\Big(1 - \frac{k}{M_i - 1}\Big)\Big(1 - \frac{k}{M_i - 2}\Big) \hat p_{M_i}(k),

    \hat p_i = P(Y_i > 0) + \hat P(X_i > 0 \mid Y_i = 0) P(Y_i = 0).

Denote by $E_{\hat p}(\cdot)$ and $Var_{\hat p}(\cdot)$ the expectation and variance with respect to the distribution of the estimators $\hat p_{M_i}(k)$, $k = 1, \dots, M_i$, $i = 1, \dots, N$. Define

    \hat{\tilde z}_i = 1, if Y_i > 0;    \hat{\tilde z}_i = \hat P(X_i > 0 \mid Y_i = 0), if Y_i = 0,

as well as the estimator of the total $t_z$

    \hat t_z^{(C)} = \sum_{i \in s_I} d_i \hat{\tilde z}_i.    (11)
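The conditional probability of Remark 2 is the main computational ingredient of estimators B and C; a sketch of formula (10) for $m = 3$ (with an invented distribution $p_{M_i}(k)$ and $M_i = 5$):

```python
# A sketch of the computation in Remark 2 (formula (10)) for m = 3: the joint
# probability P(X_i > 0, Y_i = 0), summed over k = 1, ..., M_i - 3, divided by
# P(Y_i = 0).  The distribution below is an invented example.
def p_x_pos_given_y0(p_Mi, M_i, m=3):
    """P(X_i > 0 | Y_i = 0) for SRS of m subunits out of M_i."""
    def p_y0_given_x(k):                      # P(Y_i = 0 | X_i = k)
        prod = 1.0
        for j in range(m):
            prod *= 1 - k / (M_i - j)
        return prod
    joint = sum(p_y0_given_x(k) * p_Mi[k] for k in range(1, M_i - m + 1))
    p_y0 = p_Mi[0] + joint                    # P(Y_i = 0) = P(X_i = 0) + P(X_i > 0, Y_i = 0)
    return joint / p_y0

p_Mi = {0: 0.5, 1: 0.2, 2: 0.1, 3: 0.1, 4: 0.05, 5: 0.05}   # example p_{M_i}(k), M_i = 5
print(p_x_pos_given_y0(p_Mi, M_i=5))
```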

Proposition C. Assume that the estimators $\hat p_{M_i}(k)$ are defined for the probabilities $p_{M_i}(k)$, $i = 1, 2, \dots, N$, $k = 1, \dots, M_i$. Then

(i) the expectation (under the model, the design and the distribution of $\hat p_{M_i}(k)$) of the estimator $\hat t_z^{(C)}$, given in (11), is

    E\hat t_z^{(C)} = \sum_{i=1}^{N} E_{\hat p}\, \hat p_i;

(ii) its variance is expressed by

    Var(\hat t_z^{(C)}) = Var\Big(\sum_{i \in s_I} d_i E_{\hat p}\, \hat p_i\Big) + \sum_{i=1}^{N} d_i Var_{\hat p}(\hat p_i)
        + \sum_{i=1}^{N} d_i E_{\hat p}\Big(\hat p_i - \hat p_i^2 + P(Y_i = 0)\, \hat P(X_i > 0 \mid Y_i = 0) \big(\hat P(X_i > 0 \mid Y_i = 0) - 1\big)\Big).

The first term in the expression of the variance is due to the sampling design, the third term is due to the distribution of $X_i$, $i = 1, 2, \dots, N$, and the second term is due to the estimation of the superpopulation distribution probabilities.

Remark 3. The situation may occur that clusters of subunits are associated only with some, but not all, elements of the population. Then the number of units $n'$ in the sample that are associated with some subunits may be random, with $n' \le n$. This invokes one more source of randomness in the estimators of the number of population units associated with subunits having the attribute, which is not considered here.

Estimation of the proportion. From the equalities

    \hat\mu_z = \hat t_z / N,    Var(\hat\mu_z) = Var(\hat t_z)/N^2,    \widehat{Var}(\hat\mu_z) = \widehat{Var}(\hat t_z)/N^2,

we can obtain the estimator needed for the proportion, using any of the estimators of the total presented above.

6 Possible distributions of X_i

Example 3. Let any subunit attached to some unit have an equal probability $p \in (0, 1)$ of bearing the attribute, and let the subunits obtain the attribute independently of one another. Then the number of subunits $X_i$ with the attribute is distributed according to the binomial distribution with the parameter $p$:

    p_{M_i}(k) = P(X_i = k \mid M_i) = C_{M_i}^k p^k (1 - p)^{M_i - k},    k = 0, 1, \dots, M_i.

Then

    p_i = P(X_i > 0) = 1 - P(X_i = 0) = 1 - (1 - p)^{M_i}.
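A small sketch of Example 3 (the values of $p$ and $M_i$ are arbitrary):

```python
# A sketch of Example 3: binomial superpopulation probabilities p_{M_i}(k)
# and the resulting p_i = 1 - (1 - p)^{M_i}.
from math import comb

def p_binom(k, M_i, p):
    """p_{M_i}(k) = C(M_i, k) p^k (1 - p)^(M_i - k)."""
    return comb(M_i, k) * p**k * (1 - p)**(M_i - k)

M_i, p = 4, 0.6
print([round(p_binom(k, M_i, p), 4) for k in range(M_i + 1)])
print(1 - (1 - p)**M_i)          # p_i = P(X_i > 0)
```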

For some cases of the binomial distribution of $X_i$, the probabilities $p_{M_i}(k)$ are given in Table 2. They have a peak at some $k \in (0, M_i)$ for fixed $M_i$.

Table 2. Probabilities $p_{M_i}(k)$ for the binomial distribution of $X_i$ and $p = 0.6$ [numerical entries not recoverable].

Example 4. Each subunit $j$ associated with the $i$th population unit has the attribute with its own probability $p_{(j,i)}$, $j = 1, 2, \dots, M_i$, and the subunits obtain the attribute independently of one another. Then the probabilities needed are as follows:

    P(X_i = 0) = \big(1 - p_{(1,i)}\big)\big(1 - p_{(2,i)}\big) \cdots \big(1 - p_{(M_i,i)}\big),

    p_i = 1 - P(X_i = 0) = 1 - \big(1 - p_{(1,i)}\big)\big(1 - p_{(2,i)}\big) \cdots \big(1 - p_{(M_i,i)}\big),

    p_{M_i}(u) = \sum_{\omega \subset U_{2i},\, |\omega| = u} \prod_{k \in \omega} p_{(k,i)} \prod_{k \in U_{2i} \setminus \omega} \big(1 - p_{(k,i)}\big).

Example 5. Let us suppose the superpopulation distribution of the number of subunits having the attribute (the variables $X_i$, $i = 1, 2, \dots, N$) to be known, i.e., the probabilities $p_i(k) = p_{M_i}(k) = P(X_i = k \mid M_i)$, $k = 1, 2, \dots, M_i$, $i = 1, 2, \dots, N$, to be known. We apply estimator B to the Small Population described in Example 1. The probabilities defining the distribution of $X_1$, $X_2$, $X_3$ (Model 1) are

    p_1(0) = 1,  p_1(1) = 0;    p_2(0) = 0,  p_2(1) = 1;    p_3(0) = 0,  p_3(1) = 1,  p_3(2) = 0.

Then we calculate, analogously to (10) (here one subunit out of two is sampled for $u_3$, so that $P(Y_3 = 0 \mid X_3 = k) = 1 - k/2$):

    P(X_3 > 0 \mid Y_3 = 0) P(Y_3 = 0) = \Big(1 - \frac{1}{2}\Big) p_3(1) = \frac{1}{2}.    (12)

We can easily find $P(Y_3 > 0) = 1/2$. Hence, $P(Y_3 = 0) = 1 - P(Y_3 > 0) = 1/2$. From (12) we obtain $P(X_3 > 0 \mid Y_3 = 0) = 1$. Then we can calculate all the estimates:

    \hat t_z^{(B1)} = \frac{3}{2},    \hat t_z^{(B2)} = \frac{3}{2},    \hat t_z^{(B3)} = \frac{3}{2},    \hat t_z^{(B4)} = 3,    \hat t_z^{(B5)} = 3.
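These values, and the design-and-model expectation computed next, can be checked by exact enumeration; the following sketch uses only the Small Population data and Model 1:

```python
# Exact enumeration (a sketch) of Example 5 for Model 1: it reproduces the five
# estimates listed above and their design-and-model expectation, 2 = t_z.
from fractions import Fraction as F

N, n = 3, 2
d = F(N, n)
P_Y3_pos = F(1, 2)                       # u_3: one of two subunits is sampled
P_X3_pos_given_Y3_0 = F(1, 1)            # from (12): equal to 1 under Model 1

# z-tilde values of estimator B for each unit and each second-stage outcome of u_3
zt1, zt2 = F(0), F(1)                    # u_1: no attribute possible; u_2: always observed
zt3 = {'attrib': F(1), 'nonattr': P_X3_pos_given_Y3_0}

cases = [                                # (estimate, probability)
    (d * (zt1 + zt2),            F(1, 3)),                  # s_I1 = (u_1, u_2)
    (d * (zt1 + zt3['nonattr']), F(1, 3) * (1 - P_Y3_pos)), # s_I2, nonattr sampled
    (d * (zt1 + zt3['attrib']),  F(1, 3) * P_Y3_pos),       # s_I2, attrib sampled
    (d * (zt2 + zt3['nonattr']), F(1, 3) * (1 - P_Y3_pos)), # s_I3, nonattr sampled
    (d * (zt2 + zt3['attrib']),  F(1, 3) * P_Y3_pos),       # s_I3, attrib sampled
]
print([est for est, _ in cases])                     # 3/2, 3/2, 3/2, 3, 3
print(sum(est * pr for est, pr in cases))            # 2
```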

The average of the estimator (8) with respect to the design and the model is

    E\hat t_z^{(B)} = \frac{1}{3}\Big(\hat t_z^{(B1)} + \hat t_z^{(B2)} P(Y_3 = 0) + \hat t_z^{(B3)} P(Y_3 > 0) + \hat t_z^{(B4)} P(Y_3 = 0) + \hat t_z^{(B5)} P(Y_3 > 0)\Big),    (13)

and we obtain $E\hat t_z^{(B)} = 2$. It means that $E\hat t_z^{(B)} = t_z$, i.e., estimator B is unbiased. On the other hand, we have that

    p_1 + p_2 + p_3 = P(X_1 > 0) + P(X_2 > 0) + \big(P(Y_3 > 0) + P(X_3 > 0 \mid Y_3 = 0) P(Y_3 = 0)\big) = 0 + 1 + 1/2 + 1/2 = 2

coincides with $E\hat t_z^{(B)}$, as stated in Proposition B.

Example 6. Let us consider other models for the superpopulation distribution of $X_i$ in the Small Population of Example 1, for $0 < \varepsilon < 1$. The results of the estimation are presented in Table 3.

Table 3. Results of the estimation of the total in the case of the Small Population and various superpopulation models

                                 Model 2                                        Model 3
    Model probabilities          p_1(0) = 1, p_1(1) = 0,                        p_1(0) = 1, p_1(1) = 0,
                                 p_2(0) = 0, p_2(1) = 1,                        p_2(0) = 0, p_2(1) = 1,
                                 p_3(0) = ε, p_3(1) = 1 - 2ε, p_3(2) = ε        p_3(0) = ε², p_3(1) = 2ε, p_3(2) = 1 - 2ε - ε²
    P(Y_3 > 0)                   1/2                                            1 - ε - ε²
    P(Y_3 = 0)                   1/2                                            ε + ε²
    P(X_3 > 0 | Y_3 = 0)         1 - 2ε                                         1/(1 + ε)
    \hat t_z^{(B1)}              3/2                                            3/2
    \hat t_z^{(B2)}              3(1 - 2ε)/2                                    3/(2(1 + ε))
    \hat t_z^{(B3)}              3/2                                            3/2
    \hat t_z^{(B4)}              3(1 - ε)                                       3(1 + 1/(1 + ε))/2
    \hat t_z^{(B5)}              3                                              3
    E\hat t_z^{(B)}              2 - ε                                          2 - ε²

The average of the estimator $\hat t_z^{(B)}$ with respect to the design and the model is calculated according to formula (13). We see how the average of the estimator $\hat t_z^{(B)}$ depends on the probability of a unit having at least one subunit with the attribute, and we also see the expectation of the estimator $\hat t_z^{(B)}$ under changed model assumptions (the distribution of $X_3$). Model 3 shows a distribution of $X_3$ which, for small $\varepsilon$, is of a type similar to the distribution met in the Lithuanian AES (compare Table 4), and the resulting expectation is, of course, a non-linear function of $\varepsilon$.
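The last row of Table 3 can be checked numerically; the sketch below re-runs the enumeration of Example 5 for an arbitrary superpopulation distribution of $X_3$ and a chosen $\varepsilon$:

```python
# A numerical sketch of Table 3: for a chosen epsilon, the same enumeration as in
# Example 5 gives E t_hat_B = 2 - eps under Model 2 and 2 - eps^2 under Model 3.
from fractions import Fraction as F

def expectation_B(p3):
    """Design-and-model expectation of estimator B in the Small Population,
    with p3 = (p_3(0), p_3(1), p_3(2)) the superpopulation distribution of X_3."""
    d = F(3, 2)
    p_y3_pos = F(1, 2) * p3[1] + p3[2]              # one of two subunits sampled
    p_y3_zero = 1 - p_y3_pos
    p_x3_pos_given_y0 = (F(1, 2) * p3[1]) / p_y3_zero
    zt1, zt2 = F(0), F(1)
    cases = [
        (d * (zt1 + zt2),               F(1, 3)),
        (d * (zt1 + p_x3_pos_given_y0), F(1, 3) * p_y3_zero),
        (d * (zt1 + 1),                 F(1, 3) * p_y3_pos),
        (d * (zt2 + p_x3_pos_given_y0), F(1, 3) * p_y3_zero),
        (d * (zt2 + 1),                 F(1, 3) * p_y3_pos),
    ]
    return sum(est * pr for est, pr in cases)

eps = F(1, 10)
print(expectation_B((eps, 1 - 2 * eps, eps)))                    # Model 2: 2 - eps = 19/10
print(expectation_B((eps**2, 2 * eps, 1 - 2 * eps - eps**2)))    # Model 3: 2 - eps^2 = 199/100
```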

Example 7. Let us try to find an approximation of the distribution of the number of subunits with the attribute attached to some unit, as met in the Lithuanian AES. For the individuals who participated in non-formal education in the sample of the year 2007 of the Lithuanian AES, the estimated probabilities $\hat p_{M_i}(k) = \hat P(X_i = k \mid M_i)$ are given in Table 4. They are increasing with an increase of $k$ for fixed $M_i$ and are far from those given in Table 2.

Table 4. Relative frequencies $\hat p_{M_i}(k)$ of the number of job-related learning activities in non-formal education in the Lithuanian AES [numerical entries not recoverable].

The following analytical expression can be used for approximating the probabilities $p_{M_i}(k)$ for the real data:

    f(x) = (1 + cx)^{\alpha},    \alpha > 0,  c > 0,  x \ge 0.

Choosing the proper parameters $\alpha$, $c$, we derive

    \hat p_{M_i}(k) = \hat P(X_i = k \mid M_i) = \frac{(1 + ck)^{\alpha}}{\sum_{j=0}^{M_i} (1 + cj)^{\alpha}},    k = 0, 1, \dots, M_i.    (14)

The probabilities $\hat p_{M_i}(k) = \hat P(X_i = k \mid M_i)$ estimated using this function with $\alpha = 6$ and $c = 2$ are given in Table 5. They seem to be quite close to the values of the real survey in Table 4.

Table 5. Estimated probabilities $\hat p_{M_i}(k) = \hat P(X_i = k \mid M_i)$ using the function proposed in (14) [numerical entries not recoverable].

The empirical results show that probability approximations of the type (14) can be used in estimator C (11) for the AES. We were fortunate in the Lithuanian AES of the year 2007: there were no cases in the sample with $M_i > 3$ and $Y_i = 0$. Nevertheless, such cases can occur in a subsequent survey, and the estimators proposed may then be needed.
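Approximation (14) is straightforward to compute; the sketch below evaluates it with the quoted parameters $\alpha = 6$, $c = 2$ (the resulting values of Table 5 are not reproduced here):

```python
# A sketch of approximation (14): probabilities proportional to (1 + c*k)^alpha,
# normalized over k = 0, ..., M_i, with alpha = 6 and c = 2 as quoted above.
def p_approx(M_i, alpha=6.0, c=2.0):
    weights = [(1 + c * k) ** alpha for k in range(M_i + 1)]
    total = sum(weights)
    return [w / total for w in weights]

for M_i in (1, 2, 3):
    print(M_i, [round(p, 4) for p in p_approx(M_i)])
```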

7 Conclusion

Alternative estimators have been proposed for the proportion of the population units associated with at least one subunit with the attribute of interest, using the two-stage sampling design and assumptions on the superpopulation distribution of the number of subunits having the attribute. Examples 1, 5 and 6 show that estimator B allows us to obtain unbiased estimates for this problem. The success of the use of the proposed estimators depends on knowledge of the distribution of the number of subunits with the attribute associated with the population units.

Acknowledgements

The author is thankful to Statistics Lithuania for the possibility to analyze the survey data, and to the anonymous referees for useful comments.

Appendix

Proof of Proposition B. The expectation of the estimator under the design and the model is as follows:

    E\hat t_z^{(B)} = E \sum_{i \in s_I} d_i E(\tilde z_i \mid s_I) = E \sum_{i \in s_I} d_i \big(P(Y_i > 0) + P(X_i > 0 \mid Y_i = 0) P(Y_i = 0)\big) = E \sum_{i \in s_I} d_i p_i = \sum_{i=1}^{N} p_i.

The variance is calculated taking into account that the sampling designs at both stages are independent and that $Y_i$, $i \in s_I$, are independent random variables:

    Var(\hat t_z^{(B)}) = Var\big(E(\hat t_z^{(B)} \mid s_I)\big) + E\big(Var(\hat t_z^{(B)} \mid s_I)\big) = Var\Big(\sum_{i \in s_I} d_i E(\tilde z_i \mid s_I)\Big) + E\Big(\sum_{i \in s_I} d_i^2 Var(\tilde z_i \mid s_I)\Big).

Hence,

    Var(\hat t_z^{(B)}) = Var(\hat t_z^{(A)}) + \sum_{i=1}^{N} d_i Var(\tilde z_i).    (15)

For $Var(\tilde z_i) = E\tilde z_i^2 - (E\tilde z_i)^2$ we find:

    E\tilde z_i = P(Y_i > 0) + P(X_i > 0 \mid Y_i = 0) P(Y_i = 0) = p_i,

    E\tilde z_i^2 = P(Y_i > 0) + P(X_i > 0 \mid Y_i = 0)^2 P(Y_i = 0)
        = p_i + P(X_i > 0 \mid Y_i = 0) P(Y_i = 0) \big(P(X_i > 0 \mid Y_i = 0) - 1\big)
        = p_i - P(X_i > 0 \mid Y_i = 0) P(Y_i = 0) P(X_i = 0 \mid Y_i = 0)
        = p_i - P(X_i > 0 \mid Y_i = 0) P(X_i = 0)
        = p_i - P(X_i > 0 \mid Y_i = 0) p_{M_i}(0).

Hence it follows that

    Var(\tilde z_i) = p_i - p_i^2 - P(X_i > 0 \mid Y_i = 0) p_{M_i}(0).

By substituting $Var(\tilde z_i)$ into (15), we obtain the expression of the variance. The estimator of the variance is obtained using expression (iii) of Proposition A for the first term and the unbiased Horvitz-Thompson estimator of a total for the second term of the variance.

References

1. Eurostat, Adult Education Survey (AES), .../cache/ITY_SDDS/EN/trng_aes_base.htm.
2. Eurostat, CIRCA, a collaborative workspace with partners of the European Institutions.
3. Eurostat, Database, .../...pageid=1996&dad=portal&schema=PORTAL&screen=welcomeref&open=/edtr/trng/trng_aes&language=en&product=EU_MASTER_education_training&root=EU_MASTER_education_training&scrollto=0.
4. C.-E. Särndal, B. Swensson, J. Wretman, Model Assisted Survey Sampling, Springer, New York, 1992.
