Unequal Probability Designs

1 Unequal Probability Designs
Department of Statistics, University of British Columbia
This is prepared for Stat 344, 2014

2 Sections 7.11 and 7.12

3 Probability Sampling Designs: A quick review
A probability sampling design defines a random mechanism that decides which subset of a finite population is included in the sample. In contrast, a "representative" sample may be chosen by the judgement of the samplers, who pick units they regard as typical of the population. When used appropriately, a probability sampling plan avoids human bias and allows us to assess the error of the estimators.

4 Is a sampling plan biased?
Suppose I want to get an idea of how well my students are prepared for the final exam. Let me randomly pick three students who are in class and check their readiness. This sample is biased because less prepared students are likely to miss more classes. More precisely, it is biased because the inclusion probabilities of individual students are unequal.

6 Is a biased sampling plan evil?
Textbooks often tell students that because the plan is biased, the conclusion will be biased. I must declare this is only partly true. The following statement is more accurate: the conclusion will be biased if the statistician fails to quantify the effect of the bias in the sampling and to accommodate it in the analysis. We often contradict ourselves unknowingly.

9 Examples of biased and unbiased sampling plans
SRSWOR is an unbiased sampling plan: all units in the population have equal probability of being included. The systematic sampling plan is unbiased. A cluster sampling plan on a population with equal cluster sizes is unbiased (some details will be given).

10 Examples of biased and unbiased sampling plans
Most artificial probability sampling plans we used for illustration purposes are biased. A cluster sampling plan on a population with unequal cluster sizes is biased (some details must be added). Stratified SRSWOR is usually biased unless proportional allocation is used.

13 What do we mean by biased
A probability sampling plan is biased when the inclusion probabilities $\pi_i$ are not all equal for $i = 1, 2, \ldots, N$. Do you still remember $\pi_i$, the probability that unit $i$ is in the sample? Do you remember $\pi_{ij}$, the probability that units $i$ and $j$ are both in the sample? Even if a plan (sampling design) is unbiased, the joint inclusion probabilities $\pi_{ij}$ are not required to be equal.

16 Stratified SRSWOR as an unequal probability plan
Suppose a finite population is stratified with stratum sizes $N_1, N_2, \ldots, N_G$. An SRSWOR of size $n_g$ is taken from the $g$th stratum (independently in different strata). For a unit in the $g$th stratum, its inclusion probability is $\pi_{g,i} = n_g/N_g$. Unless $n_1/N_1 = n_2/N_2 = \cdots = n_G/N_G$, this plan is biased.

17 Some properties of the unequal probability sampling plan
Suppose we have somehow managed to construct a probability sampling design (without replacement) with unequal probabilities of selection (unequal $\pi_i$). Consider a plan where the sample size $n$ is non-random.

18 Let $\delta_i$ be an indicator of whether unit $i$ is in the sample or not. This $\delta_i$ is random and
$$\sum_{i=1}^{N} \delta_i = n$$
because the sum on the left is a count of how many sampling units are in the sample. Taking expectations on both sides, we get
$$\sum_{i=1}^{N} \pi_i = n.$$
That is, the total of the inclusion probabilities across all sampling units in the population is fixed at $n$.

19 What changes are needed to $\sum_{i=1}^{N} \pi_i = n$ when $n$ is random in a sampling plan?

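20 Similarly, with the obvious notation $\delta_{ij} = \delta_i \delta_j$, we should have
$$\sum_{i=1}^{N} \sum_{j: j \neq i} \delta_{ij} = n(n-1),$$
which leads to
$$\sum_{i=1}^{N} \sum_{j: j \neq i} \pi_{ij} = n(n-1).$$
Try to show the other identities in (7.33) of the textbook for yourselves.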
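Both identities can be checked numerically by listing every possible sample of a small fixed-size design. The sketch below assumes a toy SRSWOR design with N = 5 and n = 3 (values chosen only for illustration) and uses exact fractions; the helper name `inclusion_probabilities` is hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def inclusion_probabilities(samples_with_prob, N):
    """Return (pi, pi2): first- and second-order inclusion probabilities.

    samples_with_prob is a list of (sample, probability) pairs, where each
    sample is a tuple of distinct unit labels taken from range(N).
    """
    pi = [Fraction(0)] * N
    pi2 = [[Fraction(0)] * N for _ in range(N)]
    for sample, prob in samples_with_prob:
        for i in sample:
            pi[i] += prob
            for j in sample:
                if j != i:
                    pi2[i][j] += prob
    return pi, pi2


# Hypothetical toy design: SRSWOR, so every subset of size n is equally likely.
N, n = 5, 3
all_samples = list(combinations(range(N), n))
design = [(s, Fraction(1, len(all_samples))) for s in all_samples]

pi, pi2 = inclusion_probabilities(design, N)
sum_pi = sum(pi)  # should equal n
sum_pi2 = sum(pi2[i][j] for i in range(N) for j in range(N) if j != i)  # n(n-1)
```

The same function accepts any list of samples with probabilities, so an unequal probability design of fixed size can be checked the same way.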

21 Horvitz-Thompson Estimator (7.12)
Let us change our general setting slightly. We consider the situation where a finite population has $N$ sampling units (listed in a sampling frame). A probability sampling plan has been carried out, where the sample size $n$ is allowed to be random. The inclusion probability of unit $i$ under this plan is still denoted by $\pi_i$.

22 Recall that we denote the sampling weight of a sampling unit by $w_i = 1/\pi_i$. It is the number of units in the population that this unit represents. From this view, the population total $Y$ is sensibly estimated by
$$\hat{Y}_{HT} = \sum_{i=1}^{n} w_i y_i = \sum_{i=1}^{n} \frac{y_i}{\pi_i},$$
where the sums run over the units in the sample. This estimator is named the Horvitz-Thompson estimator.

23 Bias of HT-estimator
Let $\delta_i = 1$ if the $i$th unit in the population is included in the sample, and $\delta_i = 0$ otherwise. The HT-estimator can also be written as
$$\sum_{i=1}^{n} w_i y_i = \sum_{i=1}^{N} \delta_i w_i y_i.$$
What is random in this estimator? $\delta_i$ is random. How should we compute $E(\delta_i)$ and what does it equal? $E(\delta_i) = \pi_i = 1/w_i$.

27 Bias of the HT-estimator
It is therefore seen that
$$E\Big(\sum_{i=1}^{n} w_i y_i\Big) = \sum_{i=1}^{N} E(\delta_i) w_i y_i = \sum_{i=1}^{N} y_i = Y.$$
What have we assumed here? $\pi_i > 0$ for all $i$ in the population. When $\pi_i > 0$ for all $i$, the HT-estimator is unbiased for the population total $Y$.
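The unbiasedness claim can be verified exactly on a small example by averaging the HT-estimator over every possible sample. This is only a sketch: the population values and the unequal probability design below are hypothetical, chosen so that all $\pi_i$ are positive.

```python
from fractions import Fraction

# Hypothetical population values y_1, ..., y_4 (illustrative only).
y = [Fraction(v) for v in (3, 7, 11, 20)]

# Hypothetical design: each possible sample (tuple of unit indices) with its
# selection probability. The probabilities sum to 1.
design = {
    (0, 1): Fraction(1, 10),
    (0, 2): Fraction(1, 10),
    (1, 2): Fraction(1, 10),
    (0, 3): Fraction(2, 10),
    (1, 3): Fraction(2, 10),
    (2, 3): Fraction(3, 10),
}


def first_order_probs(design, N):
    """pi_i = total probability of the samples that contain unit i."""
    pi = [Fraction(0)] * N
    for sample, prob in design.items():
        for i in sample:
            pi[i] += prob
    return pi


def ht_total(sample, y, pi):
    """Horvitz-Thompson estimate of the total from one sample."""
    return sum(y[i] / pi[i] for i in sample)


pi = first_order_probs(design, len(y))
# Expectation of the HT-estimator over the sampling design.
expected_ht = sum(prob * ht_total(s, y, pi) for s, prob in design.items())
```

Here `expected_ht` coincides with `sum(y)` even though the inclusion probabilities are unequal, which is the point of the slide.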

30 HT-estimator under stratified SRSWOR
Under stratified SRSWOR, $\pi_{g,i} = n_g/N_g$, hence $w_{g,i} = N_g/n_g$. Therefore, the HT-estimator is
$$\hat{Y}_{HT} = \sum_{g=1}^{G} \sum_{i=1}^{n_g} (N_g/n_g)\, y_{gi} = N \bar{y}_{st}.$$
That is, translated into an estimator of $\bar{Y}$, the HT-estimator is simply the stratified sample mean under this design. The moral is: we often find traces of a more advanced method in commonly used methods.
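A small numerical check of this identity, as a sketch only: the stratum sizes and sampled values below are made up, and the function names are hypothetical.

```python
from fractions import Fraction

# Hypothetical stratified sample: stratum size N_g and the sampled y-values.
strata = [
    {"N": 40, "y": [12, 15, 11]},
    {"N": 60, "y": [30, 28, 35, 31]},
]


def ht_total_stratified(strata):
    """HT total with weights w = N_g / n_g for every unit in stratum g."""
    total = Fraction(0)
    for s in strata:
        weight = Fraction(s["N"], len(s["y"]))
        total += sum(weight * yi for yi in s["y"])
    return total


def stratified_mean(strata):
    """Stratified sample mean: sum over strata of W_g * ybar_g."""
    N = sum(s["N"] for s in strata)
    return sum(
        Fraction(s["N"], N) * Fraction(sum(s["y"]), len(s["y"])) for s in strata
    )


N = sum(s["N"] for s in strata)
y_ht = ht_total_stratified(strata)
y_st = stratified_mean(strata)
```

With these numbers `y_ht` equals `N * y_st`, as the slide states.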

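32 Variance of the HT-estimator
We clearly have
$$\mathrm{Var}\Big(\sum_{i=1}^{n} w_i y_i\Big) = \sum_{i,j} \mathrm{Cov}(\delta_i, \delta_j)\, w_i w_j y_i y_j,$$
where the sum on the right is over all pairs of units in the population. Do not be scared by this complex summation. There are two cases for $\mathrm{Cov}(\delta_i, \delta_j)$. When $i = j$, we have $\mathrm{Cov}(\delta_i, \delta_j) = \pi_i(1 - \pi_i)$. When $i \neq j$, we have $\mathrm{Cov}(\delta_i, \delta_j) = \pi_{ij} - \pi_i \pi_j$. Note that $\pi_{ii} = \pi_i$, so $\mathrm{Cov}(\delta_i, \delta_j) = \pi_{ij} - \pi_i \pi_j$ is right in both cases.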

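33 Variance of the HT-estimator
Substituting $\mathrm{Cov}(\delta_i, \delta_j) = \pi_{ij} - \pi_i \pi_j$ into
$$\mathrm{Var}\Big(\sum_{i=1}^{n} w_i y_i\Big) = \sum_{i,j} \mathrm{Cov}(\delta_i, \delta_j)\, w_i w_j y_i y_j,$$
we get (7.35):
$$\mathrm{Var}\Big(\sum_{i=1}^{n} w_i y_i\Big) = \sum_{i,j} (\pi_{ij} - \pi_i \pi_j)(y_i/\pi_i)(y_j/\pi_j).$$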

34 When the sample size $n$ is fixed, we may equivalently write it as
$$\mathrm{Var}\Big(\sum_{i=1}^{n} w_i y_i\Big) = \sum_{i < j} (\pi_i \pi_j - \pi_{ij})(y_i/\pi_i - y_j/\pi_j)^2.$$
Let us do it in class with your help.
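Before doing the algebra, the equivalence of the two variance expressions can be confirmed on a toy design. The sketch below assumes a hypothetical fixed-size design (n = 2 from N = 4) and hypothetical y-values, and also compares both formulas with the variance computed directly from the sampling distribution.

```python
from fractions import Fraction

# Hypothetical population values and fixed-size (n = 2) design.
y = [Fraction(v) for v in (3, 7, 11, 20)]
design = {
    (0, 1): Fraction(1, 10),
    (0, 2): Fraction(1, 10),
    (1, 2): Fraction(1, 10),
    (0, 3): Fraction(2, 10),
    (1, 3): Fraction(2, 10),
    (2, 3): Fraction(3, 10),
}
N = len(y)


def joint_probs(design, N):
    """Matrix with pi[i][j] = pi_ij for i != j and pi[i][i] = pi_i."""
    pi = [[Fraction(0)] * N for _ in range(N)]
    for sample, prob in design.items():
        for i in sample:
            for j in sample:
                pi[i][j] += prob
    return pi


pi = joint_probs(design, N)
t = [y[i] / pi[i][i] for i in range(N)]  # expanded values y_i / pi_i

# Covariance form: sum over all pairs (i, j), including i = j.
var_cov = sum(
    (pi[i][j] - pi[i][i] * pi[j][j]) * t[i] * t[j]
    for i in range(N) for j in range(N)
)

# Pairwise-difference form: sum over i < j.
var_pair = sum(
    (pi[i][i] * pi[j][j] - pi[i][j]) * (t[i] - t[j]) ** 2
    for i in range(N) for j in range(i + 1, N)
)

# Direct computation from the sampling distribution of the HT-estimator.
Y = sum(y)
var_direct = sum(
    prob * (sum(t[i] for i in s) - Y) ** 2 for s, prob in design.items()
)
```

All three quantities agree exactly for this design; with a random sample size only the covariance form would be expected to match the direct computation.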

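36 How to estimate the above variance?
There are two suggestions. One is to estimate it as
$$v_1(\hat{Y}_{HT}) = \sum_{i=1}^{n} \frac{1 - \pi_i}{\pi_i^2}\, y_i^2 + \sum_{i \neq j} \frac{\pi_{ij} - \pi_i \pi_j}{\pi_i \pi_j \pi_{ij}}\, y_i y_j,$$
where both sums run over units in the sample.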

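37 The other is
$$v_2(\hat{Y}_{HT}) = \sum_{1 \le i < j \le n} \frac{\pi_i \pi_j - \pi_{ij}}{\pi_{ij}} \Big\{ \frac{y_i}{\pi_i} - \frac{y_j}{\pi_j} \Big\}^2.$$
Both are unbiased estimators of $\mathrm{Var}(\hat{Y}_{HT})$, provided $\pi_{ij} > 0$ for all pairs (and, for $v_2$, the sample size is fixed). Both are mathematically imperfect as they may take negative values. If you work hard enough, you will find that this imperfection is not a problem under stratified SRSWOR.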
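The unbiasedness of both variance estimators can be checked by averaging them over all samples of a small design. This is a sketch under stated assumptions: a hypothetical fixed-size design with every $\pi_{ij} > 0$, hypothetical y-values, and function names `v1` and `v2` that simply mirror the two formulas.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population values and fixed-size (n = 2) design, all pi_ij > 0.
y = [Fraction(v) for v in (3, 7, 11, 20)]
design = {
    (0, 1): Fraction(1, 10),
    (0, 2): Fraction(1, 10),
    (1, 2): Fraction(1, 10),
    (0, 3): Fraction(2, 10),
    (1, 3): Fraction(2, 10),
    (2, 3): Fraction(3, 10),
}
N = len(y)

# pi[i][j] = pi_ij for i != j, and pi[i][i] = pi_i.
pi = [[Fraction(0)] * N for _ in range(N)]
for sample, prob in design.items():
    for i in sample:
        for j in sample:
            pi[i][j] += prob


def v1(sample):
    """First variance estimator: squared terms plus cross terms."""
    diag = sum((1 - pi[i][i]) / pi[i][i] ** 2 * y[i] ** 2 for i in sample)
    cross = sum(
        (pi[i][j] - pi[i][i] * pi[j][j])
        / (pi[i][i] * pi[j][j] * pi[i][j]) * y[i] * y[j]
        for i in sample for j in sample if i != j
    )
    return diag + cross


def v2(sample):
    """Second variance estimator: pairwise differences within the sample."""
    return sum(
        (pi[i][i] * pi[j][j] - pi[i][j]) / pi[i][j]
        * (y[i] / pi[i][i] - y[j] / pi[j][j]) ** 2
        for i, j in combinations(sample, 2)
    )


Y = sum(y)
true_var = sum(
    prob * (sum(y[i] / pi[i][i] for i in s) - Y) ** 2
    for s, prob in design.items()
)
mean_v1 = sum(prob * v1(s) for s, prob in design.items())
mean_v2 = sum(prob * v2(s) for s, prob in design.items())
```

Individual values of `v1(s)` and `v2(s)` differ from sample to sample; only their design expectations match `true_var`.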

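38 Variance estimators of HT-estimate under stratified SRSWOR
Using conventional notation, and under stratified SRSWOR, we have $\pi_{ij}^{-1}(\pi_i \pi_j - \pi_{ij}) = 0$ when two units are in two different strata, because the strata are sampled independently. For two units in the same stratum $g$, $\pi_i = \pi_j = n_g/N_g$ and $\pi_{ij} = n_g(n_g - 1)/\{N_g(N_g - 1)\}$, so
$$\pi_{ij}^{-1}(\pi_i \pi_j - \pi_{ij}) = \frac{N_g - n_g}{N_g (n_g - 1)} = \frac{1 - f_g}{n_g - 1},$$
where $f_g = n_g/N_g$. Let us use them in (7.36).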

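39 Variance estimators of HT-estimate under stratified SRSWOR
Recall (7.36):
$$v\Big(\sum_{i=1}^{n} w_i y_i\Big) = \sum_{i < j} \pi_{ij}^{-1} (\pi_i \pi_j - \pi_{ij})(y_i/\pi_i - y_j/\pi_j)^2.$$
Hence, under stratified SRSWOR, we have
$$v\Big(\sum_{i=1}^{n} w_i y_i\Big) = \sum_{g=1}^{G} \frac{1 - f_g}{n_g - 1} \cdot \frac{N_g^2}{n_g^2} \sum_{i < j} (y_{gi} - y_{gj})^2 = N^2 \sum_{g} W_g^2 (1 - f_g)\, s_g^2 / n_g,$$
where $W_g = N_g/N$ and the inner sum is over pairs of sampled units in stratum $g$. I have abused notation quite badly.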

40 Variance estimators of HT-estimate under stratified SRSWOR
You should notice that
$$v\Big(\sum_{i=1}^{n} w_i y_i\Big) = N^2 \sum_{g} W_g^2 (1 - f_g)\, s_g^2 / n_g = N^2 v(\bar{y}_{st}).$$
I have skipped many details. One will not get full marks in the final exam unless these details are included and explained.
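One of the skipped details is the step from pairwise differences to the stratum sample variance. A sketch of that step, assuming the usual definitions $\bar{y}_g = n_g^{-1} \sum_i y_{gi}$ and $s_g^2 = (n_g - 1)^{-1} \sum_i (y_{gi} - \bar{y}_g)^2$:

```latex
\begin{align*}
\sum_{i < j} (y_{gi} - y_{gj})^2
  &= \frac{1}{2} \sum_{i=1}^{n_g} \sum_{j=1}^{n_g}
     \{(y_{gi} - \bar{y}_g) - (y_{gj} - \bar{y}_g)\}^2 \\
  &= \frac{1}{2} \Big\{ 2 n_g \sum_{i=1}^{n_g} (y_{gi} - \bar{y}_g)^2
     - 2 \Big( \sum_{i=1}^{n_g} (y_{gi} - \bar{y}_g) \Big)^2 \Big\} \\
  &= n_g \sum_{i=1}^{n_g} (y_{gi} - \bar{y}_g)^2
   = n_g (n_g - 1) s_g^2 , \\[4pt]
\frac{1 - f_g}{n_g - 1} \cdot \frac{N_g^2}{n_g^2} \cdot n_g (n_g - 1) s_g^2
  &= N_g^2 (1 - f_g) \frac{s_g^2}{n_g}
   = N^2 W_g^2 (1 - f_g) \frac{s_g^2}{n_g} .
\end{align*}
```

The middle line uses the fact that deviations from the stratum sample mean sum to zero.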

42 Estimate the population mean
If $\hat{Y}_{HT}$ is a good estimator of the population total, it makes sense to estimate the population mean by
$$\hat{\bar{Y}}_{HT} = \hat{Y}_{HT}/N.$$
We could instead estimate the mean by
$$\bar{Y}_{HT} = \hat{Y}_{HT} \Big/ \sum_{i=1}^{n} w_i.$$

43 The second estimator makes sense when $N$ is not known, as can happen in cluster sampling, particularly when the cluster sizes are not equal. It also works better when $\sum_{i=1}^{n} w_i$ differs a lot from $N$.
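The difference between the two mean estimators shows up clearly when the weights do not add up to N. The following sketch uses a hypothetical sample of three units with made-up inclusion probabilities; the function names are illustrative only.

```python
from fractions import Fraction


def ht_mean(y, pi, N):
    """HT total divided by a known population size N."""
    return sum(yi / p for yi, p in zip(y, pi)) / N


def ratio_mean(y, pi):
    """HT total divided by the estimated population size sum(w_i)."""
    weights = [1 / p for p in pi]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Hypothetical sample: y-values and inclusion probabilities of sampled units.
y = [Fraction(10), Fraction(12), Fraction(9)]
pi = [Fraction(1, 10), Fraction(1, 5), Fraction(1, 2)]
N = 20  # assumed known population size; the weights sum to 17, not 20

mean_known_N = ht_mean(y, pi, N)
mean_ratio = ratio_mean(y, pi)

# If every sampled y equals 5, the ratio form returns exactly 5,
# while the known-N form is scaled by sum(w_i) / N.
constant_ratio = ratio_mean([5, 5, 5], pi)
constant_known_N = ht_mean([5, 5, 5], pi, N)
```

The constant-y case is one way to see why the second estimator behaves better when the sum of the weights is far from N.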

44 PPS design motivated by the variance of the HT estimator
Staring a bit longer at the formula
$$\mathrm{Var}\Big(\sum_{i=1}^{n} w_i y_i\Big) = \sum_{i < j} (\pi_i \pi_j - \pi_{ij})(y_i/\pi_i - y_j/\pi_j)^2,$$
we may conclude that if $y_i/\pi_i = c$ for all units in the population, then
$$\mathrm{Var}(\hat{Y}_{HT}) = \mathrm{Var}\Big(\sum_{i=1}^{n} w_i y_i\Big) = 0.$$
This is wonderful, except that the idea is not feasible in applications: we would have to know all $y_i$ values to design such a plan.

48 PPS based on auxiliary information
Knowing all $y_i$ values is not realistic. Yet consider the example where the finite population is all farms in Canada. The acreages ($z$) of these farms are often known, while we might be interested in their total corn production ($y$). Creating a design such that $\pi_i \propto z_i$ is at least conceptually possible in this case. Since $z_i$ is approximately proportional to $y_i$, this design is likely efficient.

53 PPS design
Suppose size information $z_i$ is known for all units in the finite population. A probability sampling design with the property $\pi_i \propto z_i$ is called a PPS (probability proportional to size) design. A PPS design is often referred to as an optimal design. I do not like an unqualified claim of being optimal.

55 PPS design: final hurdle
Let us now try to design an optimal PPS plan such that $\pi_i \propto z_i$. How do we do it? Suppose $z_i = i$ for $i = 1, 2, \ldots, 9$ and $n = 6$.

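56 Some math behind $\pi_i$
Recall the definition of $\delta_i$ and that
$$\sum_{i=1}^{N} \delta_i = n.$$
Taking expectations on both sides, we get
$$\sum_{i} \pi_i = n.$$
Requiring $\pi_i \propto i$ together with $n = 6$ gives $\pi_i = 6i/45$, since $1 + 2 + \cdots + 9 = 45$, and we end up with $\pi_9 = 6/5$.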

57 Conclusion from the previous calculation
Because the largest possible inclusion probability is 1, a PPS plan is not always feasible. In general, it is not simple to design a sampling plan such that $\pi_i \propto z_i$ for pre-specified $z_i$.
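The calculation for $z_i = i$, $n = 6$ can be reproduced in a few lines. The function name `pps_inclusion_probabilities` is hypothetical; it only rescales the sizes so that the probabilities sum to n and does not construct a sampling plan.

```python
from fractions import Fraction


def pps_inclusion_probabilities(z, n):
    """Inclusion probabilities proportional to z that sum to n."""
    total = sum(z)
    return [Fraction(n * zi, total) for zi in z]


z = list(range(1, 10))  # z_i = i for i = 1, ..., 9
n = 6
pi = pps_inclusion_probabilities(z, n)

# Units whose required inclusion probability exceeds 1 make the plan infeasible.
infeasible = [(zi, p) for zi, p in zip(z, pi) if p > 1]
```

Here `infeasible` lists the units with $z_i = 8$ and $z_i = 9$, whose required probabilities are 16/15 and 6/5.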

58 One more slide
Much of the discussion about unequal probability sampling plans is conceptual. Try to understand the concepts, and more or less ignore these formulas.

59 Implementation of a probability sampling plan
Most toy examples of probability sampling plans can be easily implemented using cards, dice and so on. SRSWOR can be easily implemented: make $N$ cards representing the $N$ units in a finite population, shuffle them thoroughly, and take the units represented by the first $n$ cards.

61 SRSWOR
If $N$ = 10,000, making 10,000 cards is not practical. However, computer software can generate pseudo-random numbers for a large enough $N$. To the naked eye, these outcomes are random enough. Even if you do not trust computer software, there are ways to generate genuinely random numbers (as in a lottery).
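The card-shuffling recipe carries over directly to software. A minimal sketch using Python's standard `random` module, with unit labels 1 to N; the function name `srswor` and the seed value are arbitrary choices for illustration.

```python
import random


def srswor(N, n, seed=None):
    """Simple random sample without replacement of n labels from 1..N."""
    rng = random.Random(seed)
    cards = list(range(1, N + 1))  # one "card" per population unit
    rng.shuffle(cards)             # shuffle thoroughly
    return cards[:n]               # take the first n cards


sample = srswor(10_000, 25, seed=344)
```

Calling `random.Random(seed).sample(range(1, N + 1), n)` should give an equally valid SRSWOR in a single call.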

62 Systematic plan
This one is the easiest in any realistic application. Suppose we wish to sample every 100th unit in a population. Create 100 cards numbered 1 to 100, shuffle them thoroughly and pick one. Use this number as the first unit to be sampled, then take every 100th unit after it.
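In software, the 100 cards reduce to one random integer. A sketch, assuming the units are labelled 1 to N in frame order; `systematic_sample` is a hypothetical name and k is the sampling interval.

```python
import random


def systematic_sample(N, k, seed=None):
    """Pick a random start in 1..k, then every k-th unit up to N."""
    rng = random.Random(seed)
    start = rng.randint(1, k)  # randint includes both endpoints
    return list(range(start, N + 1, k))


sample = systematic_sample(1000, 100, seed=1)
```

When N is a multiple of k, every start gives the same sample size N/k; otherwise the size can differ by one depending on the start.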

63 Stratified SRSWOR
Implementing a stratified SRSWOR is no harder than implementing SRSWOR. We simply carry out SRSWOR stratum by stratum. Stratum sample sizes are design issues, not implementation issues.

64 Poisson Sampling Plan
It is generally too complex to create a sampling plan with pre-specified inclusion probabilities ($\pi_i$). The problem is even harder if one wants specific pairwise joint inclusion probabilities ($\pi_{ij}$). We usually create a plan using some common sense, and end up with whatever $\pi_i$ and $\pi_{ij}$ result. The Poisson sampling plan is one which allows us to control $\pi_i$ in a compromised way.

65 Poisson Sampling Plan
Suppose we have pre-specified inclusion probabilities ($\pi_i$) for every sampling unit in the finite population. We then toss $N$ coins such that the $i$th coin has probability $\pi_i$ of showing heads. We then sample all units whose corresponding coins show heads. Physically making and tossing $N$ coins is not sensible. Yet we can cheat with computer software.

67 Undesirable properties of the Poisson Sampling Plan
This plan does not have a pre-specified sample size: the number of sampled units is random, with expectation $\sum_{i} \pi_i$.
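The "cheating with software" version of the coin tosses takes one uniform random number per unit. A sketch with hypothetical inclusion probabilities; the function name `poisson_sample` is illustrative.

```python
import random


def poisson_sample(pi, seed=None):
    """Include unit i independently with probability pi[i]."""
    rng = random.Random(seed)
    # rng.random() is uniform on [0, 1), so the test succeeds with prob. pi[i].
    return [i for i, p in enumerate(pi) if rng.random() < p]


# Hypothetical pre-specified inclusion probabilities; they sum to 2.0,
# which is the expected (not the guaranteed) sample size.
pi = [0.1, 0.5, 0.9, 0.3, 0.2]
example = poisson_sample(pi, seed=344)
sizes = [len(poisson_sample(pi, seed=s)) for s in range(300)]
```

Inspecting `sizes` shows the drawback named on the slide: repeated draws return samples of different sizes, including possibly empty ones.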

68 Designs with arbitrary pre-specified unequal probabilities
There are many mathematically elegant solutions. None of them is simple enough to be discussed here. In applications, we do not really like them even if they are implementable.

69 Rao-Hartley-Cochran design
The Rao-Hartley-Cochran design is the only one that is actually used in applications with nice statistical efficiency. It does not achieve pre-specified inclusion probabilities. Yet it achieves high efficiency in an elegant way.

70 Rao-Hartley-Cochran design
Suppose we have a surrogate (auxiliary) size variable $z_i > 0$ for all units in the finite population. It is desirable to have the inclusion probability $\pi_i$ positively correlated with $z_i$. Let the population size be $N$ and the pre-specified sample size be $n$.

72 Rao-Hartley-Cochran design
We randomly divide the sampling units in the finite population into $n$ groups of pre-specified sizes $N_1, N_2, \ldots, N_n$, as evenly as possible. Let $Z_g = \sum_{i \in g} z_i$ for $g = 1, 2, \ldots, n$. Here $i \in g$ means all units in the $g$th group. Select one unit from the $g$th group, choosing unit $j$ with probability $p_j = z_j/Z_g$. With one unit from each of the $n$ groups, we get $n$ units in the sample. Note that the sample size is not random.
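The two steps, random grouping followed by one size-proportional draw per group, can be sketched as follows. The function name `rhc_sample`, the way the shuffled units are dealt into groups, and the example sizes are assumptions made for illustration.

```python
import random


def rhc_sample(z, n, seed=None):
    """One RHC draw: returns a list of (unit, Z_g) pairs, one per group."""
    rng = random.Random(seed)
    units = list(range(len(z)))
    rng.shuffle(units)  # random grouping
    # Deal shuffled units into n groups whose sizes differ by at most one.
    groups = [units[g::n] for g in range(n)]
    draws = []
    for group in groups:
        weights = [z[i] for i in group]
        # Choose one unit with probability z_j / Z_g within the group.
        chosen = rng.choices(group, weights=weights, k=1)[0]
        draws.append((chosen, sum(weights)))
    return draws


z = [1, 2, 3, 4, 5, 6, 7, 8, 9]  # hypothetical size measures
draws = rhc_sample(z, n=3, seed=7)
```

Returning the group total $Z_g$ alongside each selected unit keeps what is needed to form $p_g = z_g/Z_g$ later.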

74 Inclusion probability of the RHC-design
If these $n$ groups were not formed randomly, we would have $\pi_j = z_j/Z_g$. Given the outcome of the random grouping, the conditional inclusion probability is $p_j$. Unconditionally, we have $\pi_j = E(p_j)$, yet there is no simple algebraic expression for this inclusion probability.

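75 Estimating $Y$ under the RHC-design
In the spirit of the HT estimator, RHC recommend estimating the population total by
$$\hat{Y}_{RHC} = \sum_{g=1}^{n} y_g / p_g,$$
where $y_g$ and $z_g$ are the values of the unit selected from the $g$th group and $p_g = z_g/Z_g$. When $N_g = N/n$ is an integer, and the $z_i$ are scaled so that $\sum_{i=1}^{N} z_i = 1$, its variance is given by
$$\mathrm{Var}(\hat{Y}_{RHC}) = \frac{N - n}{(N - 1) n} \sum_{i=1}^{N} z_i \Big( \frac{y_i}{z_i} - Y \Big)^2$$
and it is well estimated by
$$v(\hat{Y}_{RHC}) = \frac{N - n}{N (n - 1)} \sum_{g=1}^{n} Z_g \Big( \frac{y_g}{z_g} - \hat{Y}_{RHC} \Big)^2.$$
I will provide no math here. Numerical illustration will be given when time permits.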
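As a stand-in for the numerical illustration, here is a sketch of the RHC point estimate and its variance estimate. It assumes equal group sizes N/n and uses the coefficient (N - n) / {N (n - 1)} with sizes rescaled to sum to one; the function name, argument layout and data are hypothetical.

```python
def rhc_estimate(y_s, z_s, Z_s, N, z_total):
    """RHC estimate of the total and its variance estimate.

    y_s, z_s : y and size values of the n selected units (one per group)
    Z_s      : group totals of z for the corresponding groups
    N        : population size (assumed to be a multiple of n)
    z_total  : sum of z over the whole population, used for rescaling
    """
    n = len(y_s)
    # Point estimate: sum of y_g / p_g with p_g = z_g / Z_g.
    total = sum(y * Zg / z for y, z, Zg in zip(y_s, z_s, Z_s))
    coef = (N - n) / (N * (n - 1))
    # Rescale sizes so they sum to one: z/z_total and Z_g/z_total.
    v = coef * sum(
        (Zg / z_total) * (y * z_total / z - total) ** 2
        for y, z, Zg in zip(y_s, z_s, Z_s)
    )
    return total, v


# Equal sizes, N = 6 in three groups of two; selected y-values 1, 2, 3.
total_eq, v_eq = rhc_estimate([1, 2, 3], [1, 1, 1], [2, 2, 2], N=6, z_total=6)

# y exactly proportional to z (y = 2z): every y_g / p_g term is 2 * Z_g.
total_pp, v_pp = rhc_estimate([2, 8, 4], [1, 4, 2], [3, 7, 5], N=6, z_total=15)
```

With equal sizes the result reduces to the familiar $N \bar{y}$ and $N^2 (1 - f) s^2 / n$ (12 and 6 here), and with y proportional to z the estimate is exact with zero estimated variance, which is the motivation for size-based selection.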

77 Complex designs
While very complex designs are used in the real world, they are usually assembled from simple designs. The population may first be divided into strata, with each stratum made of clusters of unequal size. A cluster itself may have some structure. A complex design may use a stratified plan on the top, a systematic plan for clusters, and an unequal probability plan within each selected cluster. Such designs are called multi-stage designs.

78 Complex designs
No matter how complex a design might be, it is made of simple ones. No matter how complex a building may appear, it is materially made of bricks, glass and steel, and structurally made of simple triangles, rectangles or at most some curves.

80 What notions should you retain?
Be able to give a clear description of how to implement SRSWOR, the Poisson plan, the systematic plan, the RHC plan, and a stratified x-plan.


More information

Introduction to Survey Data Analysis

Introduction to Survey Data Analysis Introduction to Survey Data Analysis JULY 2011 Afsaneh Yazdani Preface Learning from Data Four-step process by which we can learn from data: 1. Defining the Problem 2. Collecting the Data 3. Summarizing

More information

STA304H1F/1003HF Summer 2015: Lecture 11

STA304H1F/1003HF Summer 2015: Lecture 11 STA304H1F/1003HF Summer 2015: Lecture 11 You should know... What is one-stage vs two-stage cluster sampling? What are primary and secondary sampling units? What are the two types of estimation in cluster

More information

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 12. Random Variables: Distribution and Expectation

Discrete Mathematics and Probability Theory Fall 2013 Vazirani Note 12. Random Variables: Distribution and Expectation CS 70 Discrete Mathematics and Probability Theory Fall 203 Vazirani Note 2 Random Variables: Distribution and Expectation We will now return once again to the question of how many heads in a typical sequence

More information

SAMPLING III BIOS 662

SAMPLING III BIOS 662 SAMPLIG III BIOS 662 Michael G. Hudgens, Ph.D. mhudgens@bios.unc.edu http://www.bios.unc.edu/ mhudgens 2009-08-11 09:52 BIOS 662 1 Sampling III Outline One-stage cluster sampling Systematic sampling Multi-stage

More information

REMAINDER LINEAR SYSTEMATIC SAMPLING

REMAINDER LINEAR SYSTEMATIC SAMPLING Sankhyā : The Indian Journal of Statistics 2000, Volume 62, Series B, Pt. 2, pp. 249 256 REMAINDER LINEAR SYSTEMATIC SAMPLING By HORNG-JINH CHANG and KUO-CHUNG HUANG Tamkang University, Taipei SUMMARY.

More information
