Conditional restriction estimator for domains

Similar documents
Regression #3: Properties of OLS Estimator

Statistics II Lesson 1. Inference on one population. Year 2009/10

Mathematical statistics

Domain estimation under design-based models

Minimax design criterion for fractional factorial designs

A measurement error model approach to small area estimation

DISSERTATIONES MATHEMATICAE UNIVERSITATIS TARTUENSIS 54

Regression Estimation - Least Squares and Maximum Likelihood. Dr. Frank Wood

Small Area Modeling of County Estimates for Corn and Soybean Yields in the US

Advanced Survey Sampling

ELEG 5633 Detection and Estimation Minimum Variance Unbiased Estimators (MVUE)

Graduate Econometrics I: Unbiased Estimation

Chapters 9. Properties of Point Estimators

Parametric Bootstrap Methods for Bias Correction in Linear Mixed Models

Statistics and Econometrics I

Central Limit Theorem ( 5.3)

Chapter 8: Estimation 1

A note on multiple imputation for general purpose estimation

Parameter Estimation

Regression Estimation Least Squares and Maximum Likelihood

The outline for Unit 3

10-704: Information Processing and Learning Fall Lecture 24: Dec 7

MS&E 226: Small Data

Introduction to Simple Linear Regression

Non-parametric Inference and Resampling

Lecture 7 Introduction to Statistical Decision Theory

Maximum Likelihood Tests and Quasi-Maximum-Likelihood

MS&E 226: Small Data. Lecture 11: Maximum likelihood (v2) Ramesh Johari

Bias Variance Trade-off

Name: Matriculation Number: Tutorial Group: A B C D E

EIE6207: Estimation Theory

MINIMUM EXPECTED RISK PROBABILITY ESTIMATES FOR NONPARAMETRIC NEIGHBORHOOD CLASSIFIERS. Maya Gupta, Luca Cazzanti, and Santosh Srivastava

A General Overview of Parametric Estimation and Inference Techniques.

Lecture 2: Statistical Decision Theory (Part I)

Estimation of Parameters and Variance

Plausible Values for Latent Variables Using Mplus

Module 22: Bayesian Methods Lecture 9 A: Default prior selection

Estimation MLE-Pandemic data MLE-Financial crisis data Evaluating estimators. Estimation. September 24, STAT 151 Class 6 Slide 1

Lecture 8: Information Theory and Statistics

1 Lecture 6 Monday 01/29/01

Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics

ECE 275A Homework 6 Solutions

An Efficient Method to Estimate Labelled Sample Size for Transductive LDA(QDA/MDA) Based on Bayes Risk

Review. December 4 th, Review

An Efficient Estimation Method for Longitudinal Surveys with Monotone Missing Data

 ± σ A ± t A ˆB ± σ B ± t B. (1)

Information in a Two-Stage Adaptive Optimal Design

Indirect Sampling in Case of Asymmetrical Link Structures

arxiv: v2 [math.st] 20 Jun 2014

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

557: MATHEMATICAL STATISTICS II BIAS AND VARIANCE

ECE 275A Homework 7 Solutions

Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

Introduction to Estimation Methods for Time Series models. Lecture 1

ML and REML Variance Component Estimation. Copyright c 2012 Dan Nettleton (Iowa State University) Statistics / 58

F & B Approaches to a simple model

M(t) = 1 t. (1 t), 6 M (0) = 20 P (95. X i 110) i=1

Outline. Motivation Contest Sample. Estimator. Loss. Standard Error. Prior Pseudo-Data. Bayesian Estimator. Estimators. John Dodson.

Categorical Predictor Variables

MISCELLANEOUS TOPICS RELATED TO LIKELIHOOD. Copyright c 2012 (Iowa State University) Statistics / 30

On the bias of the multiple-imputation variance estimator in survey sampling

Applied Econometrics (QEM)

Non-parametric bootstrap mean squared error estimation for M-quantile estimates of small area means, quantiles and poverty indicators

To Estimate or Not to Estimate?

A Simulation Comparison Study for Estimating the Process Capability Index C pm with Asymmetric Tolerances

Maximum Likelihood Diffusive Source Localization Based on Binary Observations

Notes on the framework of Ando and Zhang (2005) 1 Beyond learning good functions: learning good spaces

Terminology Suppose we have N observations {x(n)} N 1. Estimators as Random Variables. {x(n)} N 1

An Interpretation of the Moore-Penrose. Generalized Inverse of a Singular Fisher Information Matrix

ANCOVA. ANCOVA allows the inclusion of a 3rd source of variation into the F-formula (called the covariate) and changes the F-formula

1. Fisher Information

Sanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore

Estimation of Parameters

Chapter 11 MIVQUE of Variances and Covariances

Classical Estimation Topics

Maximum Likelihood Estimation

Using Estimating Equations for Spatially Correlated A

Statistical Inference Using Progressively Type-II Censored Data with Random Scheme

Sampling Theory. Improvement in Variance Estimation in Simple Random Sampling

Econometrics I KS. Module 2: Multivariate Linear Regression. Alexander Ahammer. This version: April 16, 2018

Mathematical statistics

Bootstrap Approach to Comparison of Alternative Methods of Parameter Estimation of a Simultaneous Equation Model

Higher order moments of the estimated tangency portfolio weights

Generalized Method of Moment

Improved General Class of Ratio Type Estimators

5.3 LINEARIZATION METHOD. Linearization Method for a Nonlinear Estimator

Model Assisted Survey Sampling

Modification and Improvement of Empirical Likelihood for Missing Response Problem

Regression - Modeling a response

Covariance function estimation in Gaussian process regression

6.1 Variational representation of f-divergences

EFFICIENT CALCULATION OF FISHER INFORMATION MATRIX: MONTE CARLO APPROACH USING PRIOR INFORMATION. Sonjoy Das

arxiv: v1 [math.st] 22 Dec 2018

Bias of the Maximum Likelihood Estimator of the Generalized Rayleigh Distribution

Orthogonal decompositions in growth curve models

Concentration Ellipsoids

ECE 636: Systems identification

Time-series small area estimation for unemployment based on a rotating panel survey

A Non-parametric bootstrap for multilevel models

Introduction to Regression Analysis. Dr. Devlina Chatterjee 11 th August, 2017

Transcription:

Conditional restriction estimator for domains Natalja Lepik, Kaja Sõstra and Imbi Traat University of Tartu, Estonia Statistics of Estonia 2

Survey sampling, sampling theory, finite population Often a functional relationship exists between population parameters quarterly totals have to sum up to the yearly total domain totals have to sum up to the population total totals of smaller domains have to sum up to the totals of larger domains 3

Survey estimates usually do not satisfy the relationships known for the parameters: estimators are random estimators from different surveys estimators from the same survey, but not additive (domains) different estimation methods for the parameters However, users usually want that survey estimates are consistent between themselves satisfy the same restrictions known for parameters. 4

In this talk General Restriction estimator (Knottnerus, 2003) Conditional Restriction estimator applied for domains some properties simulation study 5

General Restriction (GR) estimator is a new estimator constructed on the bases of initial estimators is a vector restrictions are satisfied covariance matrix is explicitly given known long ago discovered for survey sampling purposes 6

Let the parameter vector θ = (θ 1,..., θ k ) be initially unbiasedly estimated by ˆθ = (ˆθ 1,..., ˆθ k ) Let V = E(ˆθ θ)(ˆθ θ), nonsingular. Suppose, parameters have to satisfy restrictions Rθ = c, where R : r k matrix of rank r and c : r 1 vector of constants. For example R = (1, 1,..., 1), or R = ( 1, 1,..., 1) and c = 0 7

The GR-estimator ˆθ gr = ˆθ + K ( c Rˆθ ) K = VR ( RVR ) 1 V gr Cov (ˆθ ) gr = (I KR) V. Some simple properties, Rˆθ gr = c, ˆθ gr = ˆθ, if Rˆθ = c, Eˆθ gr = θ 8

Special features of ˆθ gr = ˆθ + K ( c Rˆθ ) K = VR ( RVR ) 1 due to the finite population sampling If V is not known, the ˆθ gr is not estimator. Replacing V by ˆV gives ˆθ gr = ˆθ + ˆVR ( R ˆVR ) 1 ( c Rˆθ ) 9

Expanding ˆθ gr into Taylor series at the point (θ, V) gives the following linear term: ˆθ gr ˆθ gr (θ,v) + [ dˆθ gr d ˆV ] (θ,v) vec ( ˆV V ) + [ dˆθ gr d ˆV ] (θ,v) vec (ˆθ θ ) = ˆθ + K ( c Rˆθ ) = ˆθ gr 10

Further properties The ˆθ gr is a linear minimum variance estimator of θ, given ˆθ and given the information that c Rθ = 0. Can be shown by using projection theory. The ˆθ gr from is more effective than ˆθ can be simply seen V V gr = KRV = VR (RVR ) 1 RV > 0, because matrix of type AA is positive definite if A is of full rank (true by assumptions here). 11

Further properties If the initial ˆθ is biased, E (ˆθ ) = θ+b, then the restriction estimator ˆθ gr still exists (satisfies restrictions), but biased: It can be shown that E (ˆθ gr ) = θ + (I KR)B. MSE (ˆθ gr ) = (I KR) MSE(ˆθ) attains its minimum for K = MSE(ˆθ) R ( R MSE(ˆθ) R ) 1. The bias of ˆθ gr satisfies the following restrictions R(I KR)B = 0 12

Further properties ˆθ gr = ˆθ + K ( c Rˆθ ) K = VR ( RVR ) 1 Cov (ˆθ ) gr = (I KR) V. Another matrix representation through covariance matrices K = Cov(ˆθ, Rˆθ) Cov 1 (Rˆθ) Cov(ˆθ gr ) = Cov(ˆθ, Rˆθ) Cov 1 (Rˆθ) Cov(ˆθ, Rˆθ) 13

Conditional restriction estimator In Statistical Agencies, after publishing main estimates a need occurs for additional estimates they should be consistent with the published ones the published ones can not be changed 14

Conditional restriction estimator find the restriction estimator so that the published numbers appear in restrictions as fixed constants the variance formula gives now the conditional variance (underestimates) find the unconditional variance! 15

Conditional restriction estimator for domain estimation We have initial estimators ˆΘ 1 and ˆΘ 2 for domain parameters We want to put restrictions without changing ˆΘ 1 Find ˆΘ gr 2 so that R ˆΘ gr 2 = ˆΘ 1 Corresponding GR-estimator is ˆΘ gr 2 = ˆΘ 2 + K( ˆΘ 1 R ˆΘ 2 ), where K = V 2 R (RV 2 R ) 1. What about Cov(ˆθ gr ) now? 16

Let V 1 = Cov( ˆΘ 1 ), V 2 = Cov( ˆΘ 2 ), V 21 = Cov( ˆΘ 2, ˆΘ 1 ) The unconditional covariance matrix is Cov( ˆΘ gr 2 ) = Cov(K ˆΘ 1 + P ˆΘ 2 ) = = PV 2 + KV 1 K + PV 21 K + (PV 21 K ), where P = I KR 17

Simulation study Population is based on Estonian LFS survey data. Population size: 2000 persons (1192 households) Number of domains D = 3 Target variables: monthly salary (thousand kroons) higher education (binary variable) 18

Table 1. Population characteristics Domain # of persons Total salary Total education 1 1 019 4 999 129 2 733 4 614 209 3 248 1 396 36 Total 2 000 11 010 374 19

Simulations 10,000 independent SI-samples were drawn from the population. GR-estimates of domains ˆθ gr 1, ˆθ gr 2, ˆθ gr 3 were calculated based on ratio estimators for domains ˆθ 1, ˆθ 2, ˆθ 3 using estimated population total for restriction ˆθ U using known covariance matrix 20

Consistency problem Figure 1. Difference between the sum of domain total estimates and estimated population total: ˆθ 1 + ˆθ 2 + ˆθ 3 ˆθ U 21

SI-design, binary variable Sample Parameter D1 D2 D3 Sum U 1 ˆθ 144 277 29 450 470 ˆθ GR 152 287 31 470 470 2 ˆθ 138 153 40 331 330 ˆθ GR 138 152 40 330 330 3 ˆθ 170 199 22 391 370 ˆθ GR 162 188 20 370 370............... Mean ˆθ 130 209 36 375 374 ˆθ GR 130 209 36 375 374 True 129 209 36 374 374 22

Simulation results (1) Figure 2. Empirical and theoretical variance of estimated domain total for binary variable: Cov( ˆΘ gr 2 ) = PV 2 + KV 1 K + PV 21 K + (PV 21 K ) 23

Simulation results (2) Figure 3. Empirical and theoretical variance of estimated domain total for continuous variable: Cov( ˆΘ gr 2 ) = PV 2 + KV 1 K + PV 21 K + (PV 21 K ) 24

References Knottnerus, P. (2003). Sample Survey Theory. Some Pythagorean Perspectives. New York: Springer Rao, C.R. (1965). Linear Statistical Inference and Its Applications. New York: Wiley 25

Thank You! 26