Chapter 3: Element sampling design: Part 1

1 Chapter 3: Element sampling design: Part 1 Jae-Kwang Kim Fall, 2014

2 Outline
1. Simple random sampling
2. SRS with replacement
3. Systematic sampling
Kim, Ch. 3: Element sampling design: Part 1, Fall 2014

3 Simple random sampling
Simple Random Sampling
Motivation: Choose $n$ units from $N$ units without replacement.
1. Each subset of $n$ distinct units is equally likely to be selected.
2. There are $\binom{N}{n}$ samples of size $n$ from $N$.
3. Give equal probability of selection to each subset with $n$ units.
Definition. Sampling design for SRS:
$$P(A) = \begin{cases} 1 \big/ \binom{N}{n} & \text{if } |A| = n \\ 0 & \text{otherwise.} \end{cases}$$

4 Simple random sampling
Lemma. Under SRS, the inclusion probabilities are
$$\pi_i = \frac{n}{N}, \qquad \pi_{ij} = \frac{n(n-1)}{N(N-1)} \quad \text{for } i \neq j.$$

5 Simple random sampling
Theorem. Under the SRS design, the HT estimator
$$\hat{Y}_{HT} = \frac{N}{n} \sum_{i \in A} y_i = N \bar{y}$$
is unbiased for $Y$ and has variance
$$V(\hat{Y}_{HT}) = \frac{N^2}{n} \left(1 - \frac{n}{N}\right) S^2,$$
where
$$S^2 = \frac{1}{2N(N-1)} \sum_{i=1}^{N} \sum_{j=1}^{N} (y_i - y_j)^2 = \frac{1}{N-1} \sum_{i=1}^{N} (y_i - \bar{Y})^2.$$

6 Simple random sampling
Theorem (Cont'd). Also, the SYG variance estimator is
$$\hat{V}(\hat{Y}_{HT}) = \frac{N^2}{n} \left(1 - \frac{n}{N}\right) s^2,$$
where
$$s^2 = \frac{1}{n-1} \sum_{i \in A} (y_i - \bar{y})^2.$$
Thus, under SRS,
$$E(s^2) = S^2.$$
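The theorem can be checked numerically on a tiny population. The following is a minimal sketch, not part of the original slides: the function names `srs_total_estimate` and `check_unbiasedness` are hypothetical, and the check simply enumerates every possible sample of size $n$ and averages the estimates.

```python
from itertools import combinations
from statistics import mean, variance


def srs_total_estimate(sample_y, N):
    """Return (Y_hat, V_hat) for a simple random sample drawn without replacement."""
    n = len(sample_y)
    s2 = variance(sample_y)                  # sample variance, divisor n - 1
    y_hat = N * mean(sample_y)               # HT estimator N * y_bar
    v_hat = (N ** 2 / n) * (1 - n / N) * s2  # variance estimator with FPC
    return y_hat, v_hat


def check_unbiasedness(population_y, n):
    """Average Y_hat and V_hat over all equally likely samples of size n."""
    N = len(population_y)
    results = [srs_total_estimate(list(s), N)
               for s in combinations(population_y, n)]
    avg_y_hat = mean(r[0] for r in results)
    avg_v_hat = mean(r[1] for r in results)
    # True design variance: variance(population_y) uses divisor N - 1, i.e. S^2.
    true_var = (N ** 2 / n) * (1 - n / N) * variance(population_y)
    return avg_y_hat, avg_v_hat, true_var
```

For an illustrative population such as `[3, 1, 4, 1, 5, 9]` with `n = 3`, the first returned value should match the population total and the second should match the third, up to floating-point error.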

7 Simple random sampling
Remark (under SRS)
- $1 - n/N$ is often called the finite population correction (FPC) term.
- The FPC term can be ignored (FPC $\doteq 1$) if the sampling rate $n/N$ is small ($\le 0.05$) or for conservative inference.
- For $n = 1$, the variance of the sample mean is
$$\frac{1}{n} \left(1 - \frac{n}{N}\right) S^2 = \left(1 - \frac{1}{N}\right) S^2 = \frac{1}{N} \sum_{i=1}^{N} (y_i - \bar{Y})^2 \equiv \sigma_Y^2.$$
- Central limit theorem: under some conditions,
$$\hat{V}^{-1/2} \left(\hat{Y}_{HT} - Y\right) = \frac{\bar{y} - \bar{Y}}{\sqrt{\frac{1}{n} \left(1 - \frac{n}{N}\right) s^2}} \to N(0, 1)$$
in distribution.

8 Simple random sampling
Remark (under SRS): Sample size determination
1. Choose the target variance $V^*$ of $V(\bar{y})$.
2. Choose $n$ as the smallest integer satisfying
$$\frac{1}{n} \left(1 - \frac{n}{N}\right) S^2 \le V^*.$$
For dichotomous $y$ (taking 0 or 1), one may use $S^2 \doteq P(1 - P) \le 1/4$. A simple rule is $n \ge d^{-2}$, where $d$ is the margin of error.
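Solving the inequality in step 2 for $n$ gives $n \ge 1 / (V^*/S^2 + 1/N)$. A minimal sketch of that calculation follows; the function name `srs_sample_size` and the example inputs are assumptions made for illustration.

```python
import math


def srs_sample_size(S2, V_target, N):
    """Smallest n with (1/n)(1 - n/N) * S2 <= V_target, capped at N."""
    # Rearranged inequality: 1/n <= V_target/S2 + 1/N.
    n = math.ceil(1.0 / (V_target / S2 + 1.0 / N))
    return min(max(n, 1), N)


# Illustrative inputs (assumed): a proportion with S2 <= 1/4, margin of
# error d = 0.05, and target variance (d / 2)^2, i.e. two standard errors.
d = 0.05
n_needed = srs_sample_size(S2=0.25, V_target=(d / 2) ** 2, N=10_000)
n_simple_rule = math.ceil(1 / d ** 2)  # the rule n >= d^{-2}, which ignores the FPC
```

With these inputs the simple rule gives a slightly larger $n$ than the exact calculation, because it ignores the finite population correction.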

9 Simple random sampling
How to select a simple random sample of size $n$ from the finite population?
- Draw-by-draw procedure
- Rejective Bernoulli sampling method
- Sample reservoir method
- Random sorting method

10 Simple random sampling
Draw-by-draw procedure
For example, consider $U = \{1, 2, \dots, N\}$ and $n = 2$.
- In the first draw, select one element with equal probability.
- In the second draw, select one element with equal probability from $U \setminus \{a_1\}$, where $a_1$ is the element selected in the first draw.
- Let $a_2$ be the element selected in the second draw.

11 Simple random sampling
Draw-by-draw procedure (Cont'd)
$$P(a_1, a_2) = P(a_1) P(a_2 \mid U \setminus \{a_1\}) + P(a_2) P(a_1 \mid U \setminus \{a_2\}) = \frac{1}{N} \cdot \frac{1}{N-1} + \frac{1}{N} \cdot \frac{1}{N-1} = \frac{2}{N(N-1)},$$
which equals $1 / \binom{N}{2}$. We can prove similar results for general $n$ (use mathematical induction).
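The claim for general $n$ can be verified exactly for small $N$ by enumerating all ordered draw sequences. This is an illustrative sketch rather than a proof; the function name `draw_by_draw_design` is hypothetical.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import permutations


def draw_by_draw_design(N, n):
    """Exact probability of each unordered sample under draw-by-draw selection."""
    probs = defaultdict(Fraction)  # missing keys start at Fraction(0)
    for ordered in permutations(range(1, N + 1), n):
        # The j-th draw picks uniformly among the N - j units still available.
        p = Fraction(1)
        for j in range(n):
            p *= Fraction(1, N - j)
        probs[frozenset(ordered)] += p
    return dict(probs)
```

Every value in the returned dictionary should equal $1/\binom{N}{n}$, which is the SRS design.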

12 Simple random sampling
Rejective Bernoulli sampling method
1. Apply Bernoulli sampling of expected size $n$:
$$I_1, \dots, I_N \overset{iid}{\sim} \text{Bernoulli}(f),$$
where $f = n/N$.
2. Check whether the realized sample size is $n$. If yes, accept the sample. Otherwise, go to Step 1.

13 Simple random sampling
Rejective Bernoulli sampling method (Cont'd)
Justification:
$$P\left(I_1, I_2, \dots, I_N \,\Big|\, \sum_{i=1}^{N} I_i = n\right) = \frac{\prod_{i=1}^{N} f^{I_i} (1 - f)^{1 - I_i}}{\binom{N}{n} f^n (1 - f)^{N - n}} = \frac{1}{\binom{N}{n}}$$
if $\sum_{i=1}^{N} I_i = n$.
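The two steps of the rejective method translate directly into a loop. The sketch below is one possible implementation under the assumption that units are labelled $1, \dots, N$; the function name and the optional `rng` argument are illustrative choices.

```python
import random


def rejective_bernoulli_srs(N, n, rng=None):
    """Repeat Bernoulli(f = n/N) sampling until exactly n units are selected."""
    rng = rng or random.Random()
    f = n / N
    while True:
        # Step 1: independent Bernoulli(f) indicator for each unit.
        sample = [k for k in range(1, N + 1) if rng.random() < f]
        # Step 2: accept only if the realized size is n; otherwise retry.
        if len(sample) == n:
            return sample
```

The number of retries is random, so this approach trades simplicity for an unpredictable running time.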

14 Simple random sampling
Reservoir method (McLeod and Bellhouse, 1983)
1. The first $n$ units are selected into the sample.
2. For each $k = n + 1, \dots, N$:
   1. Select unit $k$ with probability $n/k$.
   2. If unit $k$ is selected, remove one element from the current sample with equal probability.
   3. Unit $k$ takes the place of the removed unit.
Note that the population size need not be known. You can stop at any point in the process and obtain a simple random sample from the finite population considered up to that point.
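The reservoir steps above can be sketched as a single pass over a stream of units. This is a minimal illustration; the function name `reservoir_srs` and the `rng` argument are assumptions, and the input may be any iterable whose length is not known in advance.

```python
import random


def reservoir_srs(stream, n, rng=None):
    """Single-pass SRS of size n from an iterable of unknown length."""
    rng = rng or random.Random()
    sample = []
    for k, unit in enumerate(stream, start=1):
        if k <= n:
            # Step 1: the first n units fill the reservoir.
            sample.append(unit)
        elif rng.random() < n / k:
            # Step 2: unit k enters with probability n/k and replaces
            # a uniformly chosen member of the current sample.
            sample[rng.randrange(n)] = unit
    return sample
```

If the stream contains fewer than `n` units, the function returns all of them.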

15 Simple random sampling
Random sorting method
1. A value of an independent uniform variable in $[0, 1]$ is allocated to each unit of the population.
2. The population is sorted in ascending (or descending) order of these values.
3. The first $n$ units of the sorted population are selected into the sample.
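The random sorting steps can be sketched in a few lines. The function name `random_sort_srs` is hypothetical, and the sketch assumes the whole population fits in memory.

```python
import random


def random_sort_srs(units, n, rng=None):
    """Attach an independent uniform key to each unit, sort, keep the first n."""
    rng = rng or random.Random()
    keyed = [(rng.random(), u) for u in units]  # step 1: uniform keys
    keyed.sort(key=lambda pair: pair[0])        # step 2: ascending order
    return [u for _, u in keyed[:n]]            # step 3: first n units
```

Sorting costs $O(N \log N)$, so this method is convenient rather than fast for large populations.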

16 SRS with replacement
Outline: 1. Simple random sampling; 2. SRS with replacement; 3. Systematic sampling

17 SRS with replacement
- In with-replacement sampling, the order of sample selection is important.
- Ordered sample: $OS = (a_1, a_2, \dots, a_n)$, where $a_i$ is the index of the element selected in the $i$-th with-replacement draw.
- Sample: $A = \{k : k = a_i \text{ for some } i,\ i = 1, 2, \dots, n\}$.
- SRS with replacement: at the $i$-th draw, $a_i = k$ with probability $1/N$, $k = 1, \dots, N$.
- The sample size (the number of distinct units) is a random variable. Note that
$$\pi_k = \Pr(k \in A) = 1 - \Pr(k \notin A) = 1 - \left(1 - \frac{1}{N}\right)^n.$$
Thus,
$$n_0 = \sum_{k=1}^{N} \pi_k = N - N \left(1 - \frac{1}{N}\right)^n < n \quad \text{for } n \ge 2.$$
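The expected number of distinct units can be confirmed by brute force for small $N$ and $n$. The sketch below uses exact rational arithmetic; both function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def expected_distinct(N, n):
    """Closed form: n_0 = N - N (1 - 1/N)^n."""
    return N - N * Fraction(N - 1, N) ** n


def expected_distinct_by_enumeration(N, n):
    """Average number of distinct labels over all N^n equally likely ordered samples."""
    total = sum(len(set(draws)) for draws in product(range(N), repeat=n))
    return Fraction(total, N ** n)
```

The two functions should agree exactly, and the result is strictly below $n$ whenever $n \ge 2$.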

18 SRS with replacement
1. First, define
$$Z_i = y_{a_i} = \sum_{k=1}^{N} y_k I(a_i = k).$$
Note that $Z_1, \dots, Z_n$ are independent random variables since the $n$ draws are independent.
2. $Z_1, \dots, Z_n$ are identically distributed since the same probabilities are used at each draw, where
$$E(Z_i) = \bar{Y} \quad \text{and} \quad V(Z_i) = N^{-1} \sum_{k=1}^{N} (y_k - \bar{Y})^2 \equiv \sigma_y^2.$$
3. Thus, $Z_1, \dots, Z_n$ are IID with mean $\bar{Y}$ and variance $\sigma_y^2$. Use $\bar{z} = \sum_{k=1}^{n} Z_k / n$ to estimate $\bar{Y}$.

19 SRS with replacement
Estimation of Total
Unbiased estimator of $Y$:
$$\hat{Y}_{SRSWR} = \frac{N}{n} \sum_{i=1}^{n} y_{a_i} = N \bar{y}_n.$$
Variance:
$$V(\hat{Y}_{SRSWR}) = \frac{N^2}{n} \left(1 - \frac{1}{N}\right) S^2 = \frac{N^2}{n} \sigma_y^2 \ge V(\hat{Y}_{SRS}),$$
where $S^2 = (N-1)^{-1} \sum_{i=1}^{N} (y_i - \bar{Y}_N)^2 = N (N-1)^{-1} \sigma_y^2$.
Variance estimation:
$$\hat{V}(\hat{Y}_{SRSWR}) = \frac{N^2}{n} s^2,$$
where $s^2 = (n-1)^{-1} \sum_{i=1}^{n} (y_{a_i} - \bar{y}_n)^2$. Note that $E(s^2) = \sigma_y^2$.
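These three statements (unbiasedness, the variance formula, and $E(s^2) = \sigma_y^2$) can be checked exactly on a toy population by enumerating all $N^n$ ordered samples. This is a sketch under the assumption of integer $y$ values and $n \ge 2$; the function name `srswr_moments` is hypothetical.

```python
from fractions import Fraction
from itertools import product


def srswr_moments(y, n):
    """Exact E(Y_hat), V(Y_hat) and E(V_hat) under SRS with replacement."""
    N = len(y)
    estimates, var_estimates = [], []
    for draws in product(y, repeat=n):          # all N^n ordered samples
        y_bar = Fraction(sum(draws), n)
        s2 = sum((v - y_bar) ** 2 for v in draws) / (n - 1)
        estimates.append(N * y_bar)             # Y_hat = N * y_bar_n
        var_estimates.append(Fraction(N ** 2, n) * s2)
    m = len(estimates)
    mean_est = sum(estimates) / m
    var_of_est = sum((e - mean_est) ** 2 for e in estimates) / m
    mean_var_est = sum(var_estimates) / m
    return mean_est, var_of_est, mean_var_est
```

For the illustrative population `[1, 2, 4]` with `n = 2`, all three returned values equal 7: the total is 7, and $N^2 \sigma_y^2 / n = 9 \cdot (14/9) / 2 = 7$.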

20 Systematic sampling
Outline: 1. Simple random sampling; 2. SRS with replacement; 3. Systematic sampling

21 Systematic sampling
Setup:
1. Have $N$ elements in a list.
2. Choose a positive integer $a$, called the sampling interval. Let $n = \lfloor N/a \rfloor$. That is, $N = na + c$, where $c$ is an integer with $0 \le c < a$.
3. Select a random start $r$ from $\{1, 2, \dots, a\}$ with equal probability.
4. The final sample is
$$A = \begin{cases} \{r, r + a, r + 2a, \dots, r + (n-1)a\} & \text{if } c < r \le a \\ \{r, r + a, r + 2a, \dots, r + na\} & \text{if } 1 \le r \le c. \end{cases}$$

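22 Systematic sampling
Sample size can be random:
$$n_A = \begin{cases} n & \text{if } c < r \le a \\ n + 1 & \text{if } r \le c. \end{cases}$$
Inclusion probabilities:
$$\pi_k = \frac{1}{a}, \qquad \pi_{kl} = \begin{cases} 1/a & \text{if } k \text{ and } l \text{ belong to the same systematic sample} \\ 0 & \text{otherwise.} \end{cases}$$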
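The selection rule and the random sample size can be sketched as follows. The function names are hypothetical, and the sketch assumes units are labelled $1, \dots, N$ with $1 \le a \le N$.

```python
import random


def systematic_sample_from_start(N, a, r):
    """Units r, r + a, r + 2a, ... that do not exceed N (labels 1..N)."""
    return list(range(r, N + 1, a))


def systematic_sample(N, a, rng=None):
    """Draw the random start r uniformly from {1, ..., a} and return the sample."""
    rng = rng or random.Random()
    r = rng.randint(1, a)  # randint includes both endpoints
    return systematic_sample_from_start(N, a, r)
```

With `N = 10` and `a = 3` (so `n = 3`, `c = 1`), the start `r = 1` yields four units and the starts `r = 2, 3` yield three, matching the two cases for $n_A$. Each unit appears in exactly one of the `a` possible samples, which is why $\pi_k = 1/a$.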

23 Systematic sampling
Remark
- This is very easy to do.
- This is a probability sampling design.
- This is not a measurable sampling design: there is no design-unbiased estimator of the variance (because there is only one random draw).
- Pick one set of elements (which always go together) and measure each one: later, we will call this cluster sampling.
- Divide the population into non-overlapping groups and choose an element in each group: closely related to stratification.

24 Systematic sampling
Estimation
Partition the population into $a$ groups,
$$U = U_1 \cup U_2 \cup \dots \cup U_a,$$
where the $U_i$ are disjoint.
Population total:
$$Y = \sum_{i \in U} y_i = \sum_{r=1}^{a} \sum_{k \in U_r} y_k = \sum_{r=1}^{a} t_r,$$
where $t_r = \sum_{k \in U_r} y_k$.
Think of a finite population with $a$ elements with measurements $t_1, \dots, t_a$.

25 Systematic sampling
Estimation (Cont'd)
HT estimator:
$$\hat{Y}_{HT} = \frac{t_r}{1/a} = a \, t_r \quad \text{if } A = U_r.$$
Variance: Note that we are doing SRS (of size 1) from the population of $a$ elements $\{t_1, \dots, t_a\}$.
$$\mathrm{Var}(\hat{Y}_{HT}) = \frac{a^2}{1} \left(1 - \frac{1}{a}\right) S_t^2,$$
where
$$S_t^2 = \frac{1}{a-1} \sum_{r=1}^{a} (t_r - \bar{t})^2$$
and $\bar{t} = \sum_{r=1}^{a} t_r / a$.
When is the variance small?

26 Systematic sampling
Estimation (Cont'd)
Now, assuming $N = na$,
$$V(\hat{Y}_{HT}) = a (a - 1) S_t^2 = n^2 a \sum_{r=1}^{a} (\bar{y}_r - \bar{y}_u)^2,$$
where $\bar{y}_r = t_r / n$ and $\bar{y}_u = \bar{t} / n$.
ANOVA: with $U = \cup_{r=1}^{a} U_r$,
$$SST = \sum_{k \in U} (y_k - \bar{y}_u)^2 = \sum_{r=1}^{a} \sum_{k \in U_r} (y_k - \bar{y}_u)^2 = \sum_{r=1}^{a} \sum_{k \in U_r} (y_k - \bar{y}_r)^2 + n \sum_{r=1}^{a} (\bar{y}_r - \bar{y}_u)^2 = SSW + SSB.$$

27 Systematic sampling
$$V(\hat{Y}_{HT}) = na \cdot SSB = N \cdot SSB = N (SST - SSW).$$
- If SSB is small, then the $\bar{y}_r$ are more alike and $V(\hat{Y}_{HT})$ is small.
- If SSW is small, then $V(\hat{Y}_{HT})$ is large.
The intraclass correlation coefficient $\rho$ measures the homogeneity of clusters:
$$\rho = 1 - \frac{n}{n-1} \cdot \frac{SSW}{SST}.$$
More details about $\rho$ will be covered in cluster sampling (Chapter 6).
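The ANOVA form of the variance can be compared with the variance computed directly from the $a$ possible estimates $a \, t_r$. The sketch below assumes $N = na$ with $n \ge 2$ and a non-constant $y$; the function name `systematic_anova` and the dictionary keys are illustrative.

```python
def systematic_anova(y, a):
    """SST/SSW/SSB decomposition and V(Y_hat) for systematic sampling, N = n * a."""
    N = len(y)
    if N % a != 0:
        raise ValueError("this sketch assumes N is a multiple of a")
    n = N // a
    groups = [y[r::a] for r in range(a)]  # U_1, ..., U_a (every a-th unit)
    y_u = sum(y) / N
    means = [sum(g) / n for g in groups]
    ssw = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    ssb = n * sum((m - y_u) ** 2 for m in means)
    sst = sum((v - y_u) ** 2 for v in y)
    rho = 1 - (n / (n - 1)) * ssw / sst
    # Direct design variance: each estimate a * t_r occurs with probability 1/a.
    variance_direct = sum((a * sum(g) - sum(y)) ** 2 for g in groups) / a
    return {"sst": sst, "ssw": ssw, "ssb": ssb, "rho": rho,
            "variance": N * ssb, "variance_direct": variance_direct}
```

For the linearly ordered values `1, ..., 12` with `a = 3`, the two variance entries agree and `rho` comes out negative, which corresponds to heterogeneous systematic samples and a small variance.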

28 Systematic sampling
Comparison between systematic sampling (SY) and SRS
How does SY compare to SRS when the population is sorted in the following ways?
1. Random ordering: intuitively, the two should be the same.
2. Linear ordering: SY should be better than SRS.
3. Periodic ordering: if the period equals $a$, SY can be terrible.
4. Autocorrelated ordering: successive $y_k$'s tend to lie on the same side of $\bar{y}_u$. Thus, SY should be better than SRS.

29 Systematic sampling
How to quantify?
$$V_{SRS}(\hat{Y}_{HT}) = \frac{N^2}{n} \left(1 - \frac{n}{N}\right) \frac{1}{N-1} \sum_{k=1}^{N} (y_k - \bar{Y}_N)^2$$
$$V_{SY}(\hat{Y}_{HT}) = n^2 a \sum_{r=1}^{a} (\bar{y}_r - \bar{y}_u)^2$$
Cochran (1946) introduced the superpopulation model to deal with this problem (treat $y_k$ as a random variable).
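Before turning to the superpopulation model, the two design variances can be compared directly on small artificial populations. The sketch below is illustrative only: the function names and the two toy orderings (linear and periodic with period $a$) are assumptions, and $N = na$ is assumed throughout.

```python
def v_srs(y, n):
    """Design variance of the HT total estimator under SRS of size n."""
    N = len(y)
    y_bar = sum(y) / N
    S2 = sum((v - y_bar) ** 2 for v in y) / (N - 1)
    return (N ** 2 / n) * (1 - n / N) * S2


def v_sy(y, a):
    """Design variance of the HT total estimator under systematic sampling, N = n * a."""
    N = len(y)
    n = N // a
    y_u = sum(y) / N
    group_means = [sum(y[r::a]) / n for r in range(a)]
    return n ** 2 * a * sum((m - y_u) ** 2 for m in group_means)


a, n = 3, 4
linear = list(range(1, 13))   # y_k = k: linear ordering
periodic = [0, 1, 2] * 4      # period equal to the sampling interval a

linear_ratio = v_sy(linear, a) / v_srs(linear, n)        # below 1: SY is better
periodic_ratio = v_sy(periodic, a) / v_srs(periodic, n)  # above 1: SY is worse
```

In this toy setting the ratio is below 1 for the linear ordering and well above 1 for the periodic ordering, in line with the qualitative comparison of orderings.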

30 Systematic sampling
Example: Superpopulation model for a population in random order. Denote the model by $\zeta$:
$$\{y_k\} \overset{iid}{\sim} (\mu, \sigma^2).$$
$$E_\zeta \left\{ V_{SRS}(\hat{Y}_{HT}) \right\} = \frac{N^2}{n} \left(1 - \frac{n}{N}\right) \sigma^2$$
$$E_\zeta \left\{ V_{SY}(\hat{Y}_{HT}) \right\} = \frac{N^2}{n} \left(1 - \frac{n}{N}\right) \sigma^2$$
Thus, the model expectations of the design variances are the same under the IID model.


More information

Introduction 1. STA442/2101 Fall See last slide for copyright information. 1 / 33

Introduction 1. STA442/2101 Fall See last slide for copyright information. 1 / 33 Introduction 1 STA442/2101 Fall 2016 1 See last slide for copyright information. 1 / 33 Background Reading Optional Chapter 1 of Linear models with R Chapter 1 of Davison s Statistical models: Data, and

More information

Multiple comparisons - subsequent inferences for two-way ANOVA

Multiple comparisons - subsequent inferences for two-way ANOVA 1 Multiple comparisons - subsequent inferences for two-way ANOVA the kinds of inferences to be made after the F tests of a two-way ANOVA depend on the results if none of the F tests lead to rejection of

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = β 0 + β 1 x 1 + β 2 x 2 +... β k x k + u 2. Inference 0 Assumptions of the Classical Linear Model (CLM)! So far, we know: 1. The mean and variance of the OLS estimators

More information

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review

STATS 200: Introduction to Statistical Inference. Lecture 29: Course review STATS 200: Introduction to Statistical Inference Lecture 29: Course review Course review We started in Lecture 1 with a fundamental assumption: Data is a realization of a random process. The goal throughout

More information

SURVEY SAMPLING. ijli~iiili~llil~~)~i"lij liilllill THEORY AND METHODS DANKIT K. NASSIUMA

SURVEY SAMPLING. ijli~iiili~llil~~)~ilij liilllill THEORY AND METHODS DANKIT K. NASSIUMA SURVEY SAMPLING THEORY AND METHODS DANKIT K. NASSIUMA ijli~iiili~llil~~)~i"lij liilllill 0501941 9 Table of Contents PREFACE 1 INTRODUCTION 1.1 Overview of researc h methods 1.2 Surveys and sampling 1.3

More information

One-Way Analysis of Variance. With regression, we related two quantitative, typically continuous variables.

One-Way Analysis of Variance. With regression, we related two quantitative, typically continuous variables. One-Way Analysis of Variance With regression, we related two quantitative, typically continuous variables. Often we wish to relate a quantitative response variable with a qualitative (or simply discrete)

More information

Statistical Inference

Statistical Inference Statistical Inference Classical and Bayesian Methods Class 6 AMS-UCSC Thu 26, 2012 Winter 2012. Session 1 (Class 6) AMS-132/206 Thu 26, 2012 1 / 15 Topics Topics We will talk about... 1 Hypothesis testing

More information

Lecture 4. f X T, (x t, ) = f X,T (x, t ) f T (t )

Lecture 4. f X T, (x t, ) = f X,T (x, t ) f T (t ) LECURE NOES 21 Lecture 4 7. Sufficient statistics Consider the usual statistical setup: the data is X and the paramter is. o gain information about the parameter we study various functions of the data

More information

Statistical Hypothesis Testing

Statistical Hypothesis Testing Statistical Hypothesis Testing Dr. Phillip YAM 2012/2013 Spring Semester Reference: Chapter 7 of Tests of Statistical Hypotheses by Hogg and Tanis. Section 7.1 Tests about Proportions A statistical hypothesis

More information

Optimal Estimation and Sampling Allocation in Survey Sampling Under a General Correlated Superpopulation Model

Optimal Estimation and Sampling Allocation in Survey Sampling Under a General Correlated Superpopulation Model Journal of Modern Applied Statistical Methods Volume 15 Issue 2 Article 20 11-1-2016 Optimal Estimation and Sampling Allocation in Survey Sampling Under a General Correlated Superpopulation Model Ioulia

More information

Paired comparisons. We assume that

Paired comparisons. We assume that To compare to methods, A and B, one can collect a sample of n pairs of observations. Pair i provides two measurements, Y Ai and Y Bi, one for each method: If we want to compare a reaction of patients to

More information

Confidence Intervals and Hypothesis Tests

Confidence Intervals and Hypothesis Tests Confidence Intervals and Hypothesis Tests STA 281 Fall 2011 1 Background The central limit theorem provides a very powerful tool for determining the distribution of sample means for large sample sizes.

More information

Evaluating Hypotheses

Evaluating Hypotheses Evaluating Hypotheses IEEE Expert, October 1996 1 Evaluating Hypotheses Sample error, true error Confidence intervals for observed hypothesis error Estimators Binomial distribution, Normal distribution,

More information

arxiv: v1 [stat.me] 16 Jun 2016

arxiv: v1 [stat.me] 16 Jun 2016 Causal Inference in Rebuilding and Extending the Recondite Bridge between Finite Population Sampling and Experimental arxiv:1606.05279v1 [stat.me] 16 Jun 2016 Design Rahul Mukerjee *, Tirthankar Dasgupta

More information

Lecture Notes 7 Random Processes. Markov Processes Markov Chains. Random Processes

Lecture Notes 7 Random Processes. Markov Processes Markov Chains. Random Processes Lecture Notes 7 Random Processes Definition IID Processes Bernoulli Process Binomial Counting Process Interarrival Time Process Markov Processes Markov Chains Classification of States Steady State Probabilities

More information

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Statistica Sinica 24 (2014), 1001-1015 doi:http://dx.doi.org/10.5705/ss.2013.038 INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING Seunghwan Park and Jae Kwang Kim Seoul National Univeristy

More information

Multiple Linear Regression

Multiple Linear Regression Multiple Linear Regression Simple linear regression tries to fit a simple line between two variables Y and X. If X is linearly related to Y this explains some of the variability in Y. In most cases, there

More information

Multiple Regression Analysis

Multiple Regression Analysis Multiple Regression Analysis y = 0 + 1 x 1 + x +... k x k + u 6. Heteroskedasticity What is Heteroskedasticity?! Recall the assumption of homoskedasticity implied that conditional on the explanatory variables,

More information

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006

Hypothesis Testing. Part I. James J. Heckman University of Chicago. Econ 312 This draft, April 20, 2006 Hypothesis Testing Part I James J. Heckman University of Chicago Econ 312 This draft, April 20, 2006 1 1 A Brief Review of Hypothesis Testing and Its Uses values and pure significance tests (R.A. Fisher)

More information

Ch. 1: Data and Distributions

Ch. 1: Data and Distributions Ch. 1: Data and Distributions Populations vs. Samples How to graphically display data Histograms, dot plots, stem plots, etc Helps to show how samples are distributed Distributions of both continuous and

More information

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u

So far our focus has been on estimation of the parameter vector β in the. y = Xβ + u Interval estimation and hypothesis tests So far our focus has been on estimation of the parameter vector β in the linear model y i = β 1 x 1i + β 2 x 2i +... + β K x Ki + u i = x iβ + u i for i = 1, 2,...,

More information

Simple Linear Regression

Simple Linear Regression Simple Linear Regression In simple linear regression we are concerned about the relationship between two variables, X and Y. There are two components to such a relationship. 1. The strength of the relationship.

More information

Weighting in survey analysis under informative sampling

Weighting in survey analysis under informative sampling Jae Kwang Kim and Chris J. Skinner Weighting in survey analysis under informative sampling Article (Accepted version) (Refereed) Original citation: Kim, Jae Kwang and Skinner, Chris J. (2013) Weighting

More information

The Inclusion Exclusion Principle

The Inclusion Exclusion Principle The Inclusion Exclusion Principle 1 / 29 Outline Basic Instances of The Inclusion Exclusion Principle The General Inclusion Exclusion Principle Counting Derangements Counting Functions Stirling Numbers

More information

OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores

OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS Outcome regressions and propensity scores OUTCOME REGRESSION AND PROPENSITY SCORES (CHAPTER 15) BIOS 776 1 15 Outcome regressions and propensity scores Outcome Regression and Propensity Scores ( 15) Outline 15.1 Outcome regression 15.2 Propensity

More information

RESEARCH REPORT. Vanishing auxiliary variables in PPS sampling with applications in microscopy.

RESEARCH REPORT. Vanishing auxiliary variables in PPS sampling with applications in microscopy. CENTRE FOR STOCHASTIC GEOMETRY AND ADVANCED BIOIMAGING 2014 www.csgb.dk RESEARCH REPORT Ina Trolle Andersen, Ute Hahn and Eva B. Vedel Jensen Vanishing auxiliary variables in PPS sampling with applications

More information

Statistics for Engineers Lecture 9 Linear Regression

Statistics for Engineers Lecture 9 Linear Regression Statistics for Engineers Lecture 9 Linear Regression Chong Ma Department of Statistics University of South Carolina chongm@email.sc.edu April 17, 2017 Chong Ma (Statistics, USC) STAT 509 Spring 2017 April

More information

arxiv: v1 [stat.ap] 7 Aug 2007

arxiv: v1 [stat.ap] 7 Aug 2007 IMS Lecture Notes Monograph Series Complex Datasets and Inverse Problems: Tomography, Networks and Beyond Vol. 54 (007) 11 131 c Institute of Mathematical Statistics, 007 DOI: 10.114/07491707000000094

More information

Hypothesis Testing hypothesis testing approach

Hypothesis Testing hypothesis testing approach Hypothesis Testing In this case, we d be trying to form an inference about that neighborhood: Do people there shop more often those people who are members of the larger population To ascertain this, we

More information

Sampling. Jian Pei School of Computing Science Simon Fraser University

Sampling. Jian Pei School of Computing Science Simon Fraser University Sampling Jian Pei School of Computing Science Simon Fraser University jpei@cs.sfu.ca INTRODUCTION J. Pei: Sampling 2 What Is Sampling? Select some part of a population to observe estimate something about

More information

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1)

Summary of Chapter 7 (Sections ) and Chapter 8 (Section 8.1) Summary of Chapter 7 (Sections 7.2-7.5) and Chapter 8 (Section 8.1) Chapter 7. Tests of Statistical Hypotheses 7.2. Tests about One Mean (1) Test about One Mean Case 1: σ is known. Assume that X N(µ, σ

More information