Variance estimation on SILC based indicators

Emilio Di Meglio, Eurostat, emilio.di-meglio@ec.europa.eu
Guillaume Osier, STATEC, guillaume.osier@statec.etat.lu
3rd EU-LFS/EU-SILC European User Conference

Our main message today
- EU-SILC is a sample survey, so indicators should always be read together with their accuracy measures.
- EU-SILC is a complex survey, so naive methods are not directly applicable.
- We present the first results of variance estimation using linearization techniques.

Why variance estimation?
- Requested by regulation: quality report, compliance
- Requested by users: policy relevance of indicators
- Requested by researchers

Current quality precision requirements
- According to Reg. 1982/2003, the cross-sectional (X) and longitudinal (L, initial sample) data are to be based on a nationally representative probability sample of the population residing in private households.
- Representative probability samples shall be achieved both for households and for individual persons in the target population.
- The sampling frame and methods of sample selection should ensure that every individual and household in the target population is assigned a known and non-zero probability of selection.
- Reg. 1177/2003 defines the minimum effective sample sizes to be achieved.

Minimum effective sample size

          Households                      Persons aged 16+
Country   Cross-sectional  Longitudinal   Cross-sectional  Longitudinal
BE        4 750            3 500          8 750            6 500
BG        4 500            3 500          10 000           7 500
DK        4 250            3 250          7 250            5 550
DE        8 250            6 000          14 500           10 500

DEFF (Kish, 1965)
- Definition: "The ratio of the variance under the given sample design, to the variance under a simple random sample of the same size"
- Importance: a tool to measure the efficiency of a complex sample design
- Calculation based on the at-risk-of-poverty rate
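
To make Kish's definition concrete, here is a minimal sketch for a proportion such as the at-risk-of-poverty rate. The function names and the input figures in the usage line are hypothetical illustrations, not taken from the presentation, and the simple-random-sampling variance is assumed to be p(1 - p)/n with the finite population correction ignored.

```python
def design_effect(var_design: float, p: float, n: int) -> float:
    """Kish design effect for a proportion.

    var_design: variance of the estimated proportion under the actual design
    p:          estimated proportion (between 0 and 1)
    n:          achieved sample size
    Assumption: SRS variance is p * (1 - p) / n (no finite population correction).
    """
    var_srs = p * (1.0 - p) / n
    return var_design / var_srs


def effective_sample_size(n: int, deff: float) -> float:
    """Size of the simple random sample that would give the same precision."""
    return n / deff


# Hypothetical usage: a design variance twice the SRS variance halves
# the effective sample size.
deff = design_effect(var_design=0.0002, p=0.5, n=2500)
n_eff = effective_sample_size(2500, deff)
```

A DEFF above 1 means the complex design is less efficient than simple random sampling of the same size, which is why the regulation states its requirements as effective rather than achieved sample sizes.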

What influences variance?
- Variability of the phenomenon
- Sample size
- Indicator value
- Sampling design
- Imputation
- Calibration
- Weighting

Main challenges for EU-SILC
- Difficulty of finding the "best" possible method for variance estimation
- Different designs (flexibility)
- Missing information
- Debate on methods still ongoing
- Need to differentiate between accuracy estimates for policy usage and accuracy estimates for researchers

Sampling design by country (2010)

Sampling of dwellings/addresses
- Simple random sampling: Malta
- Stratified simple random sampling: Luxembourg, Austria*
- Stratified simple random sampling from former participants of micro census: Germany
- Stratified multi-stage sampling: Czech Republic, Spain, France, Hungary, Latvia, The Netherlands, Poland, Portugal, Romania, United Kingdom

Sampling of households
- Stratified simple random sampling: Cyprus, Slovakia
- Stratified multi-stage sampling: Belgium, Bulgaria, Greece, Ireland, Italy

Sampling of individuals
- Simple random or systematic sampling: Denmark, Iceland, Sweden, Norway
- Stratified simple random or systematic sampling: Estonia, Lithuania
- Stratified two-phase sampling: Finland
- Stratified two-stage sampling: Slovenia

Sample design variables
- DB050: primary strata
- DB060: primary sampling units
- DB062: secondary sampling units
- DB070: order of selection of primary sampling units
- DB030: household ID

Our objective
- Resampling that takes into account all the possible elements coming from 32 countries would be extremely intensive in computation and resources.
- Variance estimation methods that balance scientific accuracy against administrative considerations (time, cost, simplicity) are the only viable solution.
- Aim: to quickly provide users and policy makers with standard errors for the SILC-based indicators, particularly the AROPE, its components and its main breakdowns.

The proposed approach
- We considered different methods: bootstrap, jackknife, linearisation.
- We carried out comparative experiments on a limited number of countries, and the results were similar.
- We chose to work with linearisation (the ultimate cluster approach proposed by Net-SILC2), which can provide acceptable results given the constraints we face.
- The approach was discussed at the Workshop on accuracy (Net-SILC2) and validated by the SILC WG.

The method (synthesis)
- Linearization is a technique that uses a linear approximation to reduce non-linear statistics to a linear form, justified by the asymptotic properties of the estimator (Särndal et al., 1992; Deville, 1999; Wolter, 2006; Osier, 2009).
- The "ultimate cluster" approach (Särndal et al., 1992) is a simplification that calculates the variance taking into account only the variation among Primary Sampling Unit (PSU) totals.
- This method requires first-stage sampling fractions to be small, which is nearly always the case.
- It allows great flexibility and simplifies the calculation of variances.
- It can also be generalized to calculate the variance of differences from one year to another (Berger, 2004, 2010).
- It is applicable with the main statistical packages (SAS, R, STATA).
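
As an illustration of how linearization and the ultimate cluster approach combine, the sketch below estimates the variance of a weighted ratio R = X/Y. It is not the code used at Eurostat (which relies on SAS procedures). The function name and record layout are hypothetical, first-stage sampling is assumed to be with replacement (no finite population correction), and strata with a single PSU are simply skipped, which is a simplifying assumption.

```python
from collections import defaultdict


def ratio_variance_ultimate_cluster(records):
    """Linearized, ultimate-cluster variance of a weighted ratio R = X / Y.

    records: iterable of (stratum, psu, weight, x, y) tuples, one per unit.
    Returns (R, estimated variance of R).
    """
    records = list(records)
    total_x = sum(w * x for _, _, w, x, _ in records)
    total_y = sum(w * y for _, _, w, _, y in records)
    ratio = total_x / total_y

    # Weighted PSU totals of the linearized variable z = (x - R * y) / Y.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for stratum, psu, w, x, y in records:
        psu_totals[stratum][psu] += w * (x - ratio * y) / total_y

    variance = 0.0
    for totals in psu_totals.values():
        n_h = len(totals)
        if n_h < 2:
            continue  # assumption: single-PSU strata contribute no variance
        mean_h = sum(totals.values()) / n_h
        # Between-PSU variation within the stratum.
        variance += n_h / (n_h - 1) * sum((t - mean_h) ** 2 for t in totals.values())
    return ratio, variance


# Tiny illustrative data set: one stratum, two PSUs, one unit each.
example = [("h1", "psu1", 1.0, 1.0, 1.0), ("h1", "psu2", 1.0, 0.0, 1.0)]
r, v = ratio_variance_ultimate_cluster(example)
```

Only the PSU totals enter the formula, which is why the approach needs just the strata and PSU identifiers rather than the full multi-stage design.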

The method and the results
- The methodology described has been applied at Eurostat to estimate variances and confidence intervals:
  - for AROPE and its subcomponents, for 3 age groups (0-17, 18-64, 65+) and gender breakdowns
  - for net changes
- We used the SAS procedures SURVEYMEANS and SURVEYFREQ, which allow the survey design to be specified.
- Depending on the characteristics and availability of data for each country, we used different variables to specify the strata and cluster information.

The method and the results
We used SAS PROC SURVEYFREQ (linearization), adapting the strata and cluster parameters according to the following groups:
- GROUP 1: BE, BG, CZ, IE, EL, ES, FR, IT, LV, HU, NL, PL, PT, RO, SI, UK, HR. Strata=DB050, Cluster=DB060
- GROUP 2: DE, EE, CY, LT, LU, AT, SK, FI, CH. Strata=DB050, Cluster=DB030
- GROUP 3: DK, IS, MT, NO, SE. Cluster=DB030
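
The grouping can be written down as a small lookup table. The sketch below is in Python rather than SAS, purely for illustration. The country codes and DB variable names come from the slide, while `DESIGN_GROUPS` and `design_variables` are hypothetical names.

```python
# Strata and cluster variables per country group, as listed on the slide.
DESIGN_GROUPS = {
    "GROUP 1": {
        "countries": ["BE", "BG", "CZ", "IE", "EL", "ES", "FR", "IT", "LV",
                      "HU", "NL", "PL", "PT", "RO", "SI", "UK", "HR"],
        "strata": "DB050",
        "cluster": "DB060",
    },
    "GROUP 2": {
        "countries": ["DE", "EE", "CY", "LT", "LU", "AT", "SK", "FI", "CH"],
        "strata": "DB050",
        "cluster": "DB030",
    },
    "GROUP 3": {
        "countries": ["DK", "IS", "MT", "NO", "SE"],
        "strata": None,  # no strata variable specified for this group
        "cluster": "DB030",
    },
}


def design_variables(country: str):
    """Return (strata variable, cluster variable) for a country code."""
    for group in DESIGN_GROUPS.values():
        if country in group["countries"]:
            return group["strata"], group["cluster"]
    raise KeyError(f"no design group defined for {country!r}")
```

For example, `design_variables("BE")` gives `("DB050", "DB060")`, while `design_variables("DK")` gives `(None, "DB030")`, so the household is treated as the cluster and no stratification is used.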

Results on AROPE
- For 6 countries, the 95% confidence interval for AROPE is equal to or smaller than ±1.0% (CZ, IT, SI, DE, FI, SE).
- For 11 countries, it is between ±1.0% and ±1.5% (ES, HU, PL, UK, EE, AT, SK, CH, DK, IS, NO).
- For 8 countries, it is between ±1.5% and ±2.0% (BE, BG, EL, LV, NL, PT, CY, MT).
- For 4 countries, it is larger than ±2.0% (IE, RO, LT, HR).
- Complete results are given in the EU-SILC quality report.

Results, example

Member State  Indicator value  Standard error (%)  CI 95% lower bound  CI 95% upper bound
EU27          16.4             0.14                16.08               16.64
BE            14.6             0.74                13.13               16.06
BG            20.7             0.85                19.03               22.35
CZ            9.0              0.44                8.14                9.86
IE            16.1             0.98                14.13               17.98
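
The bounds in this table are consistent with a normal-approximation interval, value ± 1.96 × standard error. A minimal sketch follows, with the hypothetical function name `confidence_interval`. Recomputing from the rounded figures shown here can differ from the published bounds in the second decimal, presumably because the published bounds were computed from unrounded values.

```python
def confidence_interval(value: float, standard_error: float, z: float = 1.96):
    """Normal-approximation confidence interval: value +/- z * standard error."""
    margin = z * standard_error
    return value - margin, value + margin


# Czech Republic row of the table: value 9.0, standard error 0.44.
lower, upper = confidence_interval(9.0, 0.44)
```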

Measurement of net changes
- Purpose: to measure the significance of the evolution of social indicators.
- Example: when the at-risk-of-poverty or social exclusion rate for Cyprus goes from 22.9% in 2010 to 23.5% in 2011, can we say that this change is significant?
- Exercise already done for: AROPE, AROPE (0-17), ARP, ARP (65+), SMD, VLWI, IWP, UMNC.

Problem statement
- Indicator as a ratio: $t = x/y$
- Absolute change: $\Delta = t_2 - t_1$
- Aim: estimation of the variance of the change
- Major problem: temporal correlation between the indicators
- $\mathrm{Var}(\Delta) = \mathrm{Var}(t_1) + \mathrm{Var}(t_2) - 2\,\mathrm{corr}(t_1, t_2)\sqrt{\mathrm{Var}(t_1)\,\mathrm{Var}(t_2)}$
- NET-SILC2: multivariate linear regression approach (Berger and Priam, Statistics Canada Symposium, 2010; SAS code developed by G. Osier)
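
The variance formula for the change can be evaluated directly once the two variances and the correlation are known. The sketch below only illustrates the arithmetic. It does not estimate the correlation, which is the part handled by the multivariate regression approach. The function name and the numbers are hypothetical.

```python
import math


def variance_of_change(var_t1: float, var_t2: float, corr: float) -> float:
    """Var(t2 - t1) = Var(t1) + Var(t2) - 2 * corr * sqrt(Var(t1) * Var(t2))."""
    return var_t1 + var_t2 - 2.0 * corr * math.sqrt(var_t1 * var_t2)


# Hypothetical figures: a positive correlation between the two years
# (rotating panel, overlapping samples) lowers the variance of the change.
independent = variance_of_change(4.0, 9.0, corr=0.0)
correlated = variance_of_change(4.0, 9.0, corr=0.5)
```

Ignoring the correlation would overstate the variance of the change whenever the two estimates are positively correlated, making real changes harder to detect.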

Algorithm used
1. Preparation of the data
2. Aggregation at PSU level (ultimate cluster approach)
3. SE estimation for the X estimator at T0
4. Repeat steps 1-3 for the X estimator at T1
5. SE estimation for changes in X estimators using the multivariate regression approach:
   - Response variables: 4 totals
   - Regressors:
     1. Stratification dummy variables
     2. Rotation variable at T0 (dummy variable which specifies which PSUs are observed at T0)
     3. Rotation variable at T1

Output

Country  AROPE 2010 (%)  AROPE 2011 (%)  Difference 2011-2010 (% points)  Standard error (% points)  Margin of error (% points) = 1.96*SE  Significance of change
BE       20.8            21              0.1                              0.076                      0.1                                   N
BG       41.6            49.1            7.5                              0.726                      1.4                                   Y
CY       22.9            23.5            0.5                              0.605                      1.2                                   N
DK       18.3            18.9            0.5                              0.448                      0.9                                   N
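
The last two columns follow from the standard error: the margin of error is 1.96 × SE, and a change is flagged as significant when its absolute value exceeds that margin. A minimal sketch, with the hypothetical function name `change_is_significant` and the figures taken from the table:

```python
def change_is_significant(difference: float, standard_error: float, z: float = 1.96):
    """Return (is_significant, margin_of_error) for a year-on-year change."""
    margin = z * standard_error
    return abs(difference) > margin, margin


# (difference in % points, standard error in % points) from the table.
rows = {"BE": (0.1, 0.076), "BG": (7.5, 0.726), "CY": (0.5, 0.605), "DK": (0.5, 0.448)}
flags = {country: change_is_significant(diff, se)[0] for country, (diff, se) in rows.items()}
```

With these inputs only the change for BG exceeds its margin of error, matching the Y/N column, so the 0.5-point increase for Cyprus cannot be called significant.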

Conclusion and future plans
- The methodology is relatively simple to apply.
- It can be considered a good compromise between scientific soundness and feasibility under current constraints.
- SILC-based indicators in the current implementation can be considered to have an overall acceptable accuracy.
- The next steps are to further improve these calculations by asking Member States to provide the necessary information where it is missing.
- Dissemination of further information to users is under investigation.