Small Area Estimation in R with Application to Mexican Income Data

1 Small Area Estimation in R with Application to Mexican Income Data

Ann-Kristin Kreutzmann (1), Sören Pannier (1), Natalia Rojas-Perilla (1), Timo Schmid (1), Matthias Templ (2), Nikos Tzavidis (3)

(1) Freie Universität Berlin
(2) Zürcher Hochschule für Angewandte Wissenschaften
(3) University of Southampton

New Techniques and Technologies for Statistics (NTTS), March 16, 2017

Ann-Kristin Kreutzmann 1 (18) - NTTS 2017

2 Why Use Small Area Estimation

- Population of interest (or target population): the population for which the survey is designed, here the State of Mexico (EDOMEX)
- Direct estimators should be reliable for the target population

Figure: Gini coefficient for EDOMEX using direct estimation.

3 Introduction to Small Area Estimation: Why Use Small Area Estimation

- Population of interest: a sub-population (domain) of the target population, planned or not in the survey design, here the municipalities in EDOMEX
- Direct estimators may be unreliable due to small sample sizes

Figure: Gini coefficients for municipalities in EDOMEX using direct estimation (left) and a Small Area Estimation (SAE) method (right), each on a colour scale from low to high. Municipalities filled with grey represent areas with zero sample size.
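
To make the problem with direct estimation concrete, the following sketch computes an unweighted Gini coefficient per domain on simulated incomes. The function `gini`, the data frame `smp`, and the domain labels A to D are illustrative assumptions and not part of the presentation; survey weights, which a real direct estimator would use, are ignored.

```r
# Unweighted Gini coefficient of a non-negative income vector.
# Assumption: no survey weights; a real direct estimator would use them.
gini <- function(y) {
  y <- sort(y)
  n <- length(y)
  2 * sum(seq_len(n) * y) / (n * sum(y)) - (n + 1) / n
}

set.seed(42)

# Hypothetical survey: domain "C" has only 3 units, domain "D" has none.
sizes <- c(A = 200, B = 50, C = 3)
smp <- data.frame(
  domain = factor(rep(names(sizes), times = sizes),
                  levels = c("A", "B", "C", "D")),
  income = rlnorm(sum(sizes), meanlog = 8, sdlog = 0.8)
)

# Direct estimate per domain; tapply() returns NA for the unsampled domain.
direct_gini <- tapply(smp$income, smp$domain, gini)
n_smp <- table(smp$domain)

print(data.frame(n = as.vector(n_smp),
                 gini = as.vector(direct_gini),
                 row.names = names(n_smp)))
```

The estimate for domain C rests on three observations only, and the unsampled domain D receives no direct estimate at all, which corresponds to the grey municipalities in the map.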

4 Combining Different Data Sources

In order to provide reliable estimates in all subdomains, efficient ways of combining information are required.

Main idea:
1. Use the observed (collected) data to fit a suitable model relating the target variable Y to covariates X
2. Produce predictions in all domains using the available covariates

Types of covariates:
- Aggregate data provide covariates for every domain
- Individual data provide covariates for every individual in every domain
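
The two-step idea can be sketched with a random-intercept model from package nlme and aggregate covariates (domain means). Everything here is simulated and hypothetical: the data frames `pop` and `smp`, the covariate `x`, and the choice of a linear mixed model for a domain mean are assumptions made for illustration, not the method used later in the presentation.

```r
library(nlme)  # recommended package shipped with standard R installations

set.seed(1)
n_dom <- 10

# Hypothetical population: 200 units per domain, one covariate x.
pop <- data.frame(domain = rep(seq_len(n_dom), each = 200))
pop$x <- rnorm(nrow(pop), mean = 5)
u <- rnorm(n_dom, sd = 0.5)  # true domain effects
pop$y <- 2 + 1.5 * pop$x + u[pop$domain] + rnorm(nrow(pop))

# Survey: 20 units from each of domains 1 to 8; domains 9 and 10 are unsampled.
sampled <- pop[pop$domain <= 8, ]
smp <- do.call(rbind, lapply(split(sampled, sampled$domain),
                             function(d) d[sample(nrow(d), 20), ]))

# Step 1: fit a model on the survey data (random intercept per domain).
fit <- lme(y ~ x, random = ~ 1 | domain, data = smp)

# Step 2: predict the mean of y in every domain from aggregate covariates.
xbar <- tapply(pop$x, pop$domain, mean)        # domain means of x
beta <- fixef(fit)
re <- ranef(fit)                                # predicted effects, sampled domains
u_hat <- setNames(rep(0, n_dom), names(xbar))   # 0 for unsampled domains
u_hat[rownames(re)] <- re[["(Intercept)"]]
pred <- beta[["(Intercept)"]] + beta[["x"]] * xbar + u_hat

print(round(pred, 2))
```

Domains 9 and 10 have no survey units, so their predictions rely on the fixed part of the model alone, while sampled domains also borrow their predicted random effect.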

5 R Packages for SAE

CRAN Task View: Official Statistics & Survey Methodology
- nlme and lme4 for mixed-effects models
- hbsae for basic area- and unit-level models fitted by restricted maximum likelihood or hierarchical Bayes
- rsae for robust basic unit- and area-level models
- JoSAE for unit-level models and the generalized regression estimator (main purpose: documentation of functions used in publications)

6 R Packages for SAE

Other packages
- BayesSAE for area-level models in a Bayesian context
- saeRobust for robust area-level models
- saery and sae2 for area-level models with time effects
- sae for a wide variety of SAE methods, including area-level and unit-level models for the mean as well as models for the estimation of non-linear parameters
- emdi for the estimation and visualization of non-linear indicators

7 Gini coefficients for municipalities in EDOMEX

How does a statistical institute obtain the map of Gini coefficients for the municipalities in EDOMEX?
- We assume that survey and census data are available for EDOMEX
- SAE methods that enable the estimation of non-linear indicators such as the Gini coefficient are implemented in package sae and in package emdi
- Since package emdi supports the whole process, from estimation through model evaluation to visualization, functions from this package are used for the following example

8 Estimation

The method used to obtain indicators for the municipalities is the Empirical Best Prediction (EBP) approach by Molina and Rao (2010).

    library(emdi)  # provides ebp()

    # census and survey are the population and sample data frames
    modelfit <- ebp(
      fixed = ictpc ~ pcocup + jnived + clase_hog + pcpering + bienes + actcom,
      pop_data = census, pop_domains = "domain_id",  # population data and domain variable
      smp_data = survey, smp_domains = "domain_id",  # sample data and domain variable
      transformation = "box.cox",                    # Box-Cox transformation of the response
      MSE = TRUE,                                    # also estimate the MSE
      custom_indicator = list(                       # user-defined indicators
        my_max = function(y, pov_line) { max(y) },
        my_min = function(y, pov_line) { min(y) }
      ),
      na.rm = TRUE
    )
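
The argument `transformation = "box.cox"` refers to the Box-Cox family of power transformations. A minimal sketch of the transformation and its inverse follows, assuming the textbook definition with an optional shift. The helper names `box_cox` and `box_cox_inv` and the income values are hypothetical, and the implementation inside emdi, including its data-driven choice of lambda, may differ.

```r
# Box-Cox transformation: (y^lambda - 1) / lambda, with log(y) as the limit
# for lambda = 0. Assumption: y + shift must be strictly positive.
box_cox <- function(y, lambda, shift = 0) {
  y <- y + shift
  if (abs(lambda) < 1e-12) log(y) else (y^lambda - 1) / lambda
}

# Inverse transformation, used to return predictions to the original scale.
box_cox_inv <- function(z, lambda, shift = 0) {
  y <- if (abs(lambda) < 1e-12) exp(z) else (lambda * z + 1)^(1 / lambda)
  y - shift
}

income <- c(500, 1200, 3500, 20000)   # made-up, right-skewed incomes
z <- box_cox(income, lambda = 0.25)   # transformed scale used for modelling
back <- box_cox_inv(z, lambda = 0.25) # round trip back to the income scale

print(cbind(income, z, back))
```

Indicators such as the Gini coefficient are defined on the original income scale, which is why an inverse transformation is needed after modelling on the transformed scale.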

9 Data and Model Diagnostics (1)

    > summary(modelfit)
    Empirical Best Prediction

    Out-of-sample domains: 67
    In-sample domains:

    Sample sizes:
    Units in sample:
    Units in population:
                        Min.  1st Qu.  Median  Mean  3rd Qu.  Max.
    Sample_domains
    Population_domains

10 Data and Model Diagnostics (2)

    Explanatory measures:
      Marginal_R2  Conditional_R2

    Residual diagnostics:
                    Skewness  Kurtosis  Shapiro_W  Shapiro_p
    Error
    Random_effect

    ICC:

    Transformation:
      Transformation  Method  Optimal_lambda  Shift_parameter
      box.cox         reml
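
The residual diagnostics and the ICC in this summary can be reproduced in spirit with base R. The sketch below assumes moment-based skewness and kurtosis (non-excess, so a normal distribution gives about 3) and the usual intraclass correlation for a random-intercept model; the helper names and the simulated residuals are hypothetical, and the exact definitions used by emdi may differ.

```r
# Moment-based skewness and (non-excess) kurtosis.
skewness <- function(x) {
  m <- mean(x)
  mean((x - m)^3) / mean((x - m)^2)^1.5
}
kurtosis <- function(x) {
  m <- mean(x)
  mean((x - m)^4) / mean((x - m)^2)^2
}

# Intraclass correlation: share of the total variance due to the domain effect.
icc <- function(var_u, var_e) var_u / (var_u + var_e)

set.seed(7)
res <- rnorm(500)        # stand-in for model residuals
sw <- shapiro.test(res)  # requires between 3 and 5000 observations

diagnostics <- c(Skewness  = skewness(res),
                 Kurtosis  = kurtosis(res),
                 Shapiro_W = unname(sw$statistic),
                 Shapiro_p = sw$p.value)
print(round(diagnostics, 3))
print(icc(var_u = 0.25, var_e = 0.75))
```

Skewness far from 0 or kurtosis far from 3 would suggest that the normality assumptions behind the model are questionable on the chosen scale.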

11 Graphical Diagnostics

    > plot(modelfit)

Figures: normal Q-Q plots of the Pearson residuals (error term) and of the random effects (quantiles against theoretical quantiles); density plots of the Pearson residuals and of the standardized random effects; Cook's distance plot (Cook's distance against index); Box-Cox REML plot (log-likelihood against λ).

12 Selection of Indicators

- The function estimators helps to select the indicators the user is interested in
- Additionally, the coefficient of variation (CV) can be requested
- The user can choose single indicators or groups of indicators

    > head(estimators(object = modelfit, indicator = "Gini", CV = TRUE))
      Domain                  Gini  Gini_CV
    1 Acambay
    2 Acolman
    3 Aculco
    4 Almoloya de Alquisiras
    5 Almoloya de Juárez
    6 Almoloya del Río
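
For orientation, the coefficient of variation relates the estimated uncertainty to the size of the estimate. The sketch below assumes the usual definition, the square root of the MSE divided by the point estimate; the helper `cv`, the domain labels, and all numbers are made up for illustration and are not results from the presentation.

```r
# Coefficient of variation: root MSE relative to the point estimate.
cv <- function(estimate, mse) sqrt(mse) / estimate

# Hypothetical point and MSE estimates for three domains.
est <- data.frame(Domain   = c("A", "B", "C"),
                  Gini     = c(0.40, 0.35, 0.50),
                  Gini_MSE = c(0.0004, 0.0001, 0.0025))
est$Gini_CV <- cv(est$Gini, est$Gini_MSE)

print(est)
```

A smaller CV indicates a more precise estimate, which makes the CV convenient for comparing precision across domains.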

13 Visualization

    # Link the domain identifiers in the data to those in the shape file
    map_table <- data.frame(domain = unique(census$domain_id),
                            mun = sort(shp_mex$mun))

    map_plot(object = modelfit, CV = TRUE, map_obj = shp_mex,
             indicator = "Gini", map_dom_id = "mun", map_tab = map_table)

Figure: maps of the Gini coefficient and of its CV for the municipalities, each on a colour scale from low to high.

14 Export to Excel

    write.excel(modelfit, file = "excel_output.xlsx", indicator = "Gini", CV = TRUE)

15

- Official statistics are interested in disaggregated indicators
- SAE methods enable the estimation of disaggregated indicators, and several R packages provide these methods
- Non-linear indicators such as the Gini coefficient or the at-risk-of-poverty rate are of special interest
- Package sae and package emdi enable the estimation of these indicators
- As shown, package emdi supports the user through the whole process, from estimation to visualization and export of results

16 References

Bates, D., Maechler, M., Bolker, B. and Walker, S. (2015). Fitting Linear Mixed-Effects Models Using lme4. Journal of Statistical Software, 67(1), doi: /jss.v067.i01.
Boonstra, H.J. (2012). hbsae: Hierarchical Bayesian Small Area Estimation. R package version 1.0.
Breidenbach, J. (2015). JoSAE: Functions for some Unit-Level Small Area Estimators and their Variances. R package.
Shi, C., with contributions from Zhang, P. (2013). BayesSAE: Bayesian Analysis of Small Area Estimation. R package.

17 References

Esteban Lefler, M.D., Morales Gonzalez, D. and Perez Martin, A. (2014). saery: Small Area Estimation for Rao and Yu Model. R package version 1.0.
Fay, R.E. and Diallo, M. (2015). sae2: Small Area Estimation: Time-series Models. R package.
Kreutzmann, A., Rojas-Perilla, N., Schmid, T., Templ, M. and Tzavidis, N. (2017). emdi: Estimating and Mapping Disaggregated Indicators. R package.
Molina, I. and Marhuenda, Y. (2015). sae: An R Package for Small Area Estimation. The R Journal 7(1).

18 References

Molina, I. and Rao, J.N.K. (2010). Small area estimation of poverty indicators. The Canadian Journal of Statistics 38(3).
Pinheiro, J., Bates, D., DebRoy, S., Sarkar, D. and R Core Team (2016). nlme: Linear and Nonlinear Mixed Effects Models. R package.
Rao, J.N.K. and Molina, I. (2015). Small Area Estimation. John Wiley & Sons.
Schoch, T. (2014). rsae: Robust Small Area Estimation. R package.
Warnholz, S. (2016). saeRobust: Robust Small Area Estimation. R package.


More information

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author...

From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. About This Book... xiii About The Author... From Practical Data Analysis with JMP, Second Edition. Full book available for purchase here. Contents About This Book... xiii About The Author... xxiii Chapter 1 Getting Started: Data Analysis with JMP...

More information

Longitudinal + Reliability = Joint Modeling

Longitudinal + Reliability = Joint Modeling Longitudinal + Reliability = Joint Modeling Carles Serrat Institute of Statistics and Mathematics Applied to Building CYTED-HAROSA International Workshop November 21-22, 2013 Barcelona Mainly from Rizopoulos,

More information

boxcoxmix: An R Package for Response Transformations for Random Effect and Variance Component Models

boxcoxmix: An R Package for Response Transformations for Random Effect and Variance Component Models boxcoxmix: An R Package for Response Transformations for Random Effect and Variance Component Models Amani Almohaimeed and Jochen Einbeck Qassim University and Durham University Abstract Random effect

More information

Model Fitting. Jean Yves Le Boudec

Model Fitting. Jean Yves Le Boudec Model Fitting Jean Yves Le Boudec 0 Contents 1. What is model fitting? 2. Linear Regression 3. Linear regression with norm minimization 4. Choosing a distribution 5. Heavy Tail 1 Virus Infection Data We

More information

BANA 7046 Data Mining I Lecture 2. Linear Regression, Model Assessment, and Cross-validation 1

BANA 7046 Data Mining I Lecture 2. Linear Regression, Model Assessment, and Cross-validation 1 BANA 7046 Data Mining I Lecture 2. Linear Regression, Model Assessment, and Cross-validation 1 Shaobo Li University of Cincinnati 1 Partially based on Hastie, et al. (2009) ESL, and James, et al. (2013)

More information

Forecasting with R A practical workshop

Forecasting with R A practical workshop Forecasting with R A practical workshop International Symposium on Forecasting 2016 19 th June 2016 Nikolaos Kourentzes nikolaos@kourentzes.com http://nikolaos.kourentzes.com Fotios Petropoulos fotpetr@gmail.com

More information

Probabilistic temperature post-processing using a skewed response distribution

Probabilistic temperature post-processing using a skewed response distribution Probabilistic temperature post-processing using a skewed response distribution Manuel Gebetsberger 1, Georg J. Mayr 1, Reto Stauffer 2, Achim Zeileis 2 1 Institute of Atmospheric and Cryospheric Sciences,

More information

Regression Analysis III: Advanced Methods

Regression Analysis III: Advanced Methods Regression Analysis III: Advanced Methods Dave Armstrong University of Oxford david.armstrong@politics.ox.ac.uk Teaching Assistant: Matthew Painter, Ohio State University Painter.63@sociology.osu.edu Course

More information

SAS/STAT 13.2 User s Guide. Introduction to Survey Sampling and Analysis Procedures

SAS/STAT 13.2 User s Guide. Introduction to Survey Sampling and Analysis Procedures SAS/STAT 13.2 User s Guide Introduction to Survey Sampling and Analysis Procedures This document is an individual chapter from SAS/STAT 13.2 User s Guide. The correct bibliographic citation for the complete

More information

Bayesian closed skew Gaussian inversion of seismic AVO data into elastic material properties

Bayesian closed skew Gaussian inversion of seismic AVO data into elastic material properties Bayesian closed skew Gaussian inversion of seismic AVO data into elastic material properties Omid Karimi 1,2, Henning Omre 2 and Mohsen Mohammadzadeh 1 1 Tarbiat Modares University, Iran 2 Norwegian University

More information

MEMORANDUM. Fish Passage Advisory Committee. Fish Passage Center. DATE: May 15, 2018

MEMORANDUM. Fish Passage Advisory Committee. Fish Passage Center. DATE: May 15, 2018 FISH PASSAGE CENTER 847 N.E. 19 th Avenue, #250, Portland, Oregon 97232 Phone: (503) 833-3900 Fax: (503) 232-1259 www.fpc.org e-mail us at fpcstaff@fpc.org MEMORANDUM TO: FROM: Fish Passage Advisory Committee

More information

INFERENCE FOR MULTIPLE LINEAR REGRESSION MODEL WITH EXTENDED SKEW NORMAL ERRORS

INFERENCE FOR MULTIPLE LINEAR REGRESSION MODEL WITH EXTENDED SKEW NORMAL ERRORS Pak. J. Statist. 2016 Vol. 32(2), 81-96 INFERENCE FOR MULTIPLE LINEAR REGRESSION MODEL WITH EXTENDED SKEW NORMAL ERRORS A.A. Alhamide 1, K. Ibrahim 1 M.T. Alodat 2 1 Statistics Program, School of Mathematical

More information

Introduction to Linear regression analysis. Part 2. Model comparisons

Introduction to Linear regression analysis. Part 2. Model comparisons Introduction to Linear regression analysis Part Model comparisons 1 ANOVA for regression Total variation in Y SS Total = Variation explained by regression with X SS Regression + Residual variation SS Residual

More information

ARIC Manuscript Proposal # PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2

ARIC Manuscript Proposal # PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2 ARIC Manuscript Proposal # 1186 PC Reviewed: _9/_25_/06 Status: A Priority: _2 SC Reviewed: _9/_25_/06 Status: A Priority: _2 1.a. Full Title: Comparing Methods of Incorporating Spatial Correlation in

More information

Eric V. Slud, Census Bureau & Univ. of Maryland Mathematics Department, University of Maryland, College Park MD 20742

Eric V. Slud, Census Bureau & Univ. of Maryland Mathematics Department, University of Maryland, College Park MD 20742 COMPARISON OF AGGREGATE VERSUS UNIT-LEVEL MODELS FOR SMALL-AREA ESTIMATION Eric V. Slud, Census Bureau & Univ. of Maryland Mathematics Department, University of Maryland, College Park MD 20742 Key words:

More information

Directed acyclic graphs and the use of linear mixed models

Directed acyclic graphs and the use of linear mixed models Directed acyclic graphs and the use of linear mixed models Siem H. Heisterkamp 1,2 1 Groningen Bioinformatics Centre, University of Groningen 2 Biostatistics and Research Decision Sciences (BARDS), MSD,

More information

Index. Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables.

Index. Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables. Index Pagenumbersfollowedbyf indicate figures; pagenumbersfollowedbyt indicate tables. Adaptive rejection metropolis sampling (ARMS), 98 Adaptive shrinkage, 132 Advanced Photo System (APS), 255 Aggregation

More information

Bayesian Estimation Under Informative Sampling with Unattenuated Dependence

Bayesian Estimation Under Informative Sampling with Unattenuated Dependence Bayesian Estimation Under Informative Sampling with Unattenuated Dependence Matt Williams 1 Terrance Savitsky 2 1 Substance Abuse and Mental Health Services Administration Matthew.Williams@samhsa.hhs.gov

More information

Longitudinal and Panel Data: Analysis and Applications for the Social Sciences. Table of Contents

Longitudinal and Panel Data: Analysis and Applications for the Social Sciences. Table of Contents Longitudinal and Panel Data Preface / i Longitudinal and Panel Data: Analysis and Applications for the Social Sciences Table of Contents August, 2003 Table of Contents Preface i vi 1. Introduction 1.1

More information

ECLT 5810 Data Preprocessing. Prof. Wai Lam

ECLT 5810 Data Preprocessing. Prof. Wai Lam ECLT 5810 Data Preprocessing Prof. Wai Lam Why Data Preprocessing? Data in the real world is imperfect incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate

More information

Multilevel modeling and panel data analysis in educational research (Case study: National examination data senior high school in West Java)

Multilevel modeling and panel data analysis in educational research (Case study: National examination data senior high school in West Java) Multilevel modeling and panel data analysis in educational research (Case study: National examination data senior high school in West Java) Pepi Zulvia, Anang Kurnia, and Agus M. Soleh Citation: AIP Conference

More information

Generalized Linear Models

Generalized Linear Models Generalized Linear Models Lecture 3. Hypothesis testing. Goodness of Fit. Model diagnostics GLM (Spring, 2018) Lecture 3 1 / 34 Models Let M(X r ) be a model with design matrix X r (with r columns) r n

More information

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij =

K. Model Diagnostics. residuals ˆɛ ij = Y ij ˆµ i N = Y ij Ȳ i semi-studentized residuals ω ij = ˆɛ ij. studentized deleted residuals ɛ ij = K. Model Diagnostics We ve already seen how to check model assumptions prior to fitting a one-way ANOVA. Diagnostics carried out after model fitting by using residuals are more informative for assessing

More information

0.1 gamma.mixed: Mixed effects gamma regression

0.1 gamma.mixed: Mixed effects gamma regression 0. gamma.mixed: Mixed effects gamma regression Use generalized multi-level linear regression if you have covariates that are grouped according to one or more classification factors. Gamma regression models

More information

Outline. Mixed models in R using the lme4 package Part 3: Longitudinal data. Sleep deprivation data. Simple longitudinal data

Outline. Mixed models in R using the lme4 package Part 3: Longitudinal data. Sleep deprivation data. Simple longitudinal data Outline Mixed models in R using the lme4 package Part 3: Longitudinal data Douglas Bates Longitudinal data: sleepstudy A model with random effects for intercept and slope University of Wisconsin - Madison

More information

Item Reliability Analysis

Item Reliability Analysis Item Reliability Analysis Revised: 10/11/2017 Summary... 1 Data Input... 4 Analysis Options... 5 Tables and Graphs... 5 Analysis Summary... 6 Matrix Plot... 8 Alpha Plot... 10 Correlation Matrix... 11

More information

Spatial Variation in Hospitalizations for Cardiometabolic Ambulatory Care Sensitive Conditions Across Canada

Spatial Variation in Hospitalizations for Cardiometabolic Ambulatory Care Sensitive Conditions Across Canada Spatial Variation in Hospitalizations for Cardiometabolic Ambulatory Care Sensitive Conditions Across Canada CRDCN Conference November 14, 2017 Martin Cooke Alana Maltby Sarah Singh Piotr Wilk Today s

More information

Beyond Mean Regression

Beyond Mean Regression Beyond Mean Regression Thomas Kneib Lehrstuhl für Statistik Georg-August-Universität Göttingen 8.3.2013 Innsbruck Introduction Introduction One of the top ten reasons to become statistician (according

More information