Bayesian Estimation Under Informative Sampling with Unattenuated Dependence

Similar documents
Developing Calibration Weights and Variance Estimates for a Survey of Drug-Related Emergency-Room Visits

Estimation of small-area proportions using covariates and survey data

Small Area Modeling of County Estimates for Corn and Soybean Yields in the US

Variance Estimation of the Survey-Weighted Kappa Measure of Agreement

Fractional Imputation in Survey Sampling: A Comparative Review

REPLICATION VARIANCE ESTIMATION FOR THE NATIONAL RESOURCES INVENTORY

Taking into account sampling design in DAD. Population SAMPLING DESIGN AND DAD

A measurement error model approach to small area estimation

Sample size and Sampling strategy

Fractional Hot Deck Imputation for Robust Inference Under Item Nonresponse in Survey Sampling

Sanjay Chaudhuri Department of Statistics and Applied Probability, National University of Singapore

Data Integration for Big Data Analysis for finite population inference

Part 7: Glossary Overview

Frailty Modeling for Spatially Correlated Survival Data, with Application to Infant Mortality in Minnesota By: Sudipto Banerjee, Mela. P.

Incorporating Level of Effort Paradata in Nonresponse Adjustments. Paul Biemer RTI International University of North Carolina Chapel Hill

Propensity Score Weighting with Multilevel Data

HB Methods for Combining Estimates from Multiple Surveys

Introduction to Survey Data Analysis

Statistical Analysis of List Experiments

New Developments in Econometrics Lecture 9: Stratified Sampling

STAT 5500/6500 Conditional Logistic Regression for Matched Pairs

Spatial Variation in Hospitalizations for Cardiometabolic Ambulatory Care Sensitive Conditions Across Canada

Gibbs Sampling in Endogenous Variables Models

Lecture 5: Sampling Methods

Targeted Maximum Likelihood Estimation for Adaptive Designs: Adaptive Randomization in Community Randomized Trial

Chapter 3: Element sampling design: Part 1

Constrained Maximum Likelihood Estimation for Model Calibration Using Summary-level Information from External Big Data Sources

Monte Carlo Study on the Successive Difference Replication Method for Non-Linear Statistics

,..., θ(2),..., θ(n)

multilevel modeling: concepts, applications and interpretations

In Praise of the Listwise-Deletion Method (Perhaps with Reweighting)

A MODEL-BASED EVALUATION OF SEVERAL WELL-KNOWN VARIANCE ESTIMATORS FOR THE COMBINED RATIO ESTIMATOR

Complex sample design effects and inference for mental health survey data

A Bayesian Partitioning Approach to Duplicate Detection and Record Linkage

STAT 5500/6500 Conditional Logistic Regression for Matched Pairs

Lecturer: Dr. Adote Anum, Dept. of Psychology Contact Information:

Combining multiple observational data sources to estimate causal eects

Day 8: Sampling. Daniel J. Mallinson. School of Public Affairs Penn State Harrisburg PADM-HADM 503

On the occurrence times of componentwise maxima and bias in likelihood inference for multivariate max-stable distributions

Sampling from Finite Populations Jill M. Montaquila and Graham Kalton Westat 1600 Research Blvd., Rockville, MD 20850, U.S.A.

Unequal Probability Designs

Multivariate Survival Analysis

Log-linear Modelling with Complex Survey Data

Eric V. Slud, Census Bureau & Univ. of Maryland Mathematics Department, University of Maryland, College Park MD 20742

CSC321 Lecture 18: Learning Probabilistic Models

Conducting Fieldwork and Survey Design

Data Mining Chapter 4: Data Analysis and Uncertainty Fall 2011 Ming Li Department of Computer Science and Technology Nanjing University

ICES training Course on Design and Analysis of Statistically Sound Catch Sampling Programmes

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2012

PIRLS 2016 Achievement Scaling Methodology 1

The STS Surgeon Composite Technical Appendix

Aggregating Ordinal Labels from Crowds by Minimax Conditional Entropy. Denny Zhou Qiang Liu John Platt Chris Meek

Introduction to Survey Sampling

Figure 36: Respiratory infection versus time for the first 49 children.

Define characteristic function. State its properties. State and prove inversion theorem.

Eco517 Fall 2014 C. Sims FINAL EXAM

Variance estimation on SILC based indicators

Fitting Multidimensional Latent Variable Models using an Efficient Laplace Approximation

Stratified Randomized Experiments

STAT 525 Fall Final exam. Tuesday December 14, 2010

Formalizing the Concepts: Simple Random Sampling. Juan Muñoz Kristen Himelein March 2013

Targeted Maximum Likelihood Estimation in Safety Analysis

Selection on Observables: Propensity Score Matching.

Bayesian regression tree models for causal inference: regularization, confounding and heterogeneity

Sampling. Jian Pei School of Computing Science Simon Fraser University

Causal Inference with a Continuous Treatment and Outcome: Alternative Estimators for Parametric Dose-Response Functions

University of California at Berkeley Fall Introductory Applied Econometrics Final examination. Scores add up to 125 points

Contextual Effects in Modeling for Small Domains

Module 4 Approaches to Sampling. Georgia Kayser, PhD The Water Institute

DATA-ADAPTIVE VARIABLE SELECTION FOR

Even Simpler Standard Errors for Two-Stage Optimization Estimators: Mata Implementation via the DERIV Command

Introduction to Spatial Data and Models

2011 NATIONAL SURVEY ON DRUG USE AND HEALTH

Advanced Methods for Agricultural and Agroenvironmental. Emily Berg, Zhengyuan Zhu, Sarah Nusser, and Wayne Fuller

Learning Objectives for Stat 225

Empirical likelihood inference for regression parameters when modelling hierarchical complex survey data

Causal Hazard Ratio Estimation By Instrumental Variables or Principal Stratification. Todd MacKenzie, PhD

Quantile POD for Hit-Miss Data

Topic 3 Populations and Samples

Introduction to Statistical Data Analysis Lecture 4: Sampling

Part 8: GLMs and Hierarchical LMs and GLMs

Previous lecture. P-value based combination. Fixed vs random effects models. Meta vs. pooled- analysis. New random effects testing.

VOL. 3, NO.6, June 2013 ISSN ARPN Journal of Science and Technology All rights reserved.

6. Fractional Imputation in Survey Sampling

Some Concepts of Probability (Review) Volker Tresp Summer 2018

Relating Latent Class Analysis Results to Variables not Included in the Analysis

What s New in Econometrics. Lecture 13

Plausible Values for Latent Variables Using Mplus

1 Motivation for Instrumental Variable (IV) Regression

Bayesian nonparametric estimation of finite population quantities in absence of design information on nonsampled units

INSTRUMENTAL-VARIABLE CALIBRATION ESTIMATION IN SURVEY SAMPLING

A Bayesian Nonparametric Model for Predicting Disease Status Using Longitudinal Profiles

Constrained estimation for binary and survival data

Pattern Recognition and Machine Learning. Bishop Chapter 2: Probability Distributions

H-LIKELIHOOD ESTIMATION METHOOD FOR VARYING CLUSTERED BINARY MIXED EFFECTS MODEL

Households or locations? Cities, catchment areas and prosperity in India

Bayesian inference for multivariate extreme value distributions

Empirical and Constrained Empirical Bayes Variance Estimation Under A One Unit Per Stratum Sample Design

Model Selection Tutorial 2: Problems With Using AIC to Select a Subset of Exposures in a Regression Model

Finite Population Sampling and Inference

Transcription:

Bayesian Estimation Under Informative Sampling with Unattenuated Dependence Matt Williams 1 Terrance Savitsky 2 1 Substance Abuse and Mental Health Services Administration Matthew.Williams@samhsa.hhs.gov 2 Bureau of Labor Statistics Savitsky.Terrance@bls.gov Federal Committee on Statistical Methodology Research and Policy Conference March 9, 2018 1/19

Inference from Survey Samples Goal of Analyst: perform inference about a finite population generated from an unknown model, P 0. Data Collected: from under a complex sampling design distribution, P ν Probabilities of inclusion are often associated with the variable of interest (purposefully) Sampling designs are informative : the balance of information in the sample balance in the population. Biased Estimation: estimate P 0 without accounting for P ν. 2/19

What does it mean for a Design to be Informative? Consistency Theory (estimators collapsing to true P 0 ) Empashize use of inverse probability weights, w i = 1/π i using marginal (individual) probabilities of being selected Weights are used as plug-in corrections (Savitsky and Toth, 2016) or co-modelled (Novelo and Savitsky, 2017) Sampled elements assumed to become more and more independent with larger sample/population sizes (π ij π i π j ) Estimation in Practice Weighting, stratification, and clustering are all incoporated. Variances estimated by aggregating up to the (asymptotically) independent units (cluster ID s or primary sampling units (PSU s). Individual units aren t assumed to be sampled independently in general (π ij π i π j ). 3/19

Motivating Example: The National Survey on Drug Use and Health (NSDUH) Sponsored by Substance Abuse and Mental Health Services Administration (SAMHSA) Primary source for: alcohol, illicit drug, substance use disorder, mental health issues and their co-occurrence Civilian, non-institutionalized population for 12 and older in the US. Multistage, state-based sampling design (Morton et al., 2016) Early stages defined by geography: Select dwelling units (DU s) nested within census block groups (PSU s) Geographic units stratified implicitly via sorting on socioeconomic indicators and selected proportional to compositive population size measure (Systematic PPS). DU s selected within segment via random starting point and selecting every k th unit (Systematic) Individuals (0,1, or 2), selected jointly as pairs with different probabilities assigned to different age-pair combinations (Unequal joint selection of pairs) 4/19

NSDUH: Depression and Smoking (12 and older, 2014) 0.4 0.3 Past Month Smoker 0.2 0.1 weight Equal Prob 0.0 No Yes Past Year Major Depressive Episode 5/19

Motivating Example: NSDUH (continued) Smoking and depression have the potential to cluster geographically and within dwelling units, since related demographics (age, urbanicity, education, etc) may cluster. Current literature is silent on the issue of non-ignorable clustering that may be informative (i.e. related to the response of interest). We present conditions for asymptotic consistency for designs like the NSDUH with common features such as: Cluster sampling, selecting only one unit per cluster, or selecting multiple individuals from a dwelling unit. Population information used to sort sampling units along gradients. 6/19

Estimation Methods to Account for Informative Sampling First-order weights and likelihood Plug-in to exponentiate likelihood (Savitsky and Toth, 2016) Joint modeling of weights and outcome (Novelo and Savitsky, 2017) Second-order (pairwise weights) and composite likelihood Clever way to perform hierarchical modelling when design and population model have same structure (Yi et al., 2016) Only beats first-order weights under specific conditions for population model and sample (Williams and Savitsky, 2017) E.g: conditional behavior of spouse-spouse pairs within households AND differential selection of pairs of individuals within a household related to outcome AND subsetting of sample based on information only available in the sample (pair relationship) For simplicity, we focus on the first-order weights using the plug-in method 7/19

The Pseudo-Posterior Estimator (Savitsky and Toth, 2016) The plug-in estimator for posterior density under the analyst-specified model for λ Λ is [ n ] ˆπ (λ y o, w) p (y o,i λ) w i π (λ), (1) i=1 pseudo-likelihood: n i=1 p (y o,i λ) w i prior: π (λ) values y o and sampling weights { w} for individuals observed in sample 8/19

Consistency Conditions (Savitsky and Toth, 2016) Based on Empirical Process functionals (Ghosal and van der Vaart, 2007). Population U ν of size N ν growing with index ν. (A1) (Local entropy condition - Size of model) (A2) (Size of space) (A3) (Prior mass covering the truth) (A4) (Non-zero Inclusion Probabilities) sup ν 1 γ, with P 0 probability 1. min π νi i U ν (A6) (Constant Sampling fraction) For constant f (0, 1), lim sup n ν f N ν = O(1), with P 0 probability 1. ν 9/19

Consistency Conditions (NEW) (A5.1) (Growth of dependence is restricted) Binary partition {S ν1, S ν2 } of the set of all pairs S ν = {{i, j} : i j U ν } such that and lim sup ν max i,j S ν2 lim sup S ν1 = O(N ν ), ν π νij 1 π νi π νj = O( Nν 1 ), P0 probability 1 such that for some constants, C 4, C 5 > 0 and for N ν sufficiently large, S ν1 C 4 N ν, and N ν sup max π νij ν i,j S ν2 1 π νi π νj C 5, (A5.2) (Special Case: Dependence restricted to countable blocks of bounded size) 10/19

Theorem (Updated) Theorem Suppose conditions hold. Then for sets P Nν P, constants, K > 0, and M sufficiently large, E P0,P ν Π π ( P : dn π ) ν (P, P 0 ) Mξ Nν X 1 δ ν1,..., X Nν δ νnν ( 16γ 2 [γc 2 + C 3 ] (Kf + 1 2γ) 2 + 5γ exp Kn ) νξn 2 ν, (2) N ν ξn 2 2γ ν which tends to 0 as (n ν, N ν ). Note that C 2 = C 4 + 1. When S ν1 0 then C 2 = 1, leading to main result in Savitsky and Toth (2016). 11/19

Simulation Examples: Two (Three) Parameter Logistic Regression ind y i µ i Bern ( F 1 (µ i ) ), i = 1,..., N l µ = 1.88 + 1.0x1 + 0.5x2 The x 1 and x 2 distributions are N (0, 1) and E(r = 1/5) with rate r Size measure used for sample selection is x2 = x2 min(x2) + 1, but neither x2 or x2 are available to the analyst. 12/19

Simulation Example: Three-Stage Sample Area (PPS), Household (Systematic, sorting by Size), Individual (PPS) Deviation 40 30 20 10 0 Deviation 1.0 0.5 0.0 0.5 1.0 Figure: Factorization matrix (π ij /(π i π j ) 1) for two PSU s. Magnitude (left) and Sign (right). Systematic Sampling (π ij = 0). Clustering and PPS sampling (π ij > π i π j ). Independent first stage sample (π ij = π i π j ) 13/19

Simulation Example: Three-Stage Sample (Cont) 2 50 100 200 400 800 1 0 1 0.0 2.5 5.0 7.5 0 1 2 3 Curve logbias logmse 4 5 2 1 0 1 2 2 1 0 1 2 2 1 0 1 2 2 1 0 1 2 2 1 0 1 2 x Figure: The marginal estimate of µ = f (x 1 ). population curve, sample with equal weights, and inverse probability weights. Top to bottom: estimated curve, log of BIAS, log MSE. Left to right: sample size (50 to 800). 14/19

Simulation Example: Sorted Partitions Sort population by x2 Partition the population U into a high (U 1 ) group with the top N/2 and a low (U 2 ) group with the bottom N/2 Only two possible samples of size N/2: U 1 and U 2. Assume an equal probability of selection of 1/2. π i = 1/2, for all i 1,..., N, and π ij = 1/2 if i j U k, for k = 1, 2 and 0 otherwise. Thus, the number of pairwise inclusions probabilities that do not factor (π ij 1/4) grows at rate O ( N 2), violating condition (A5.1). Alternatively, nest partitions within strata (every 50 sorted units) and sample indepedently across strata factorization for all but O(N) pairwise inclusion probabilities statisfies condition (A5.1) 15/19

Simulation Example: Sorted Partitions (Cont) 0 2 4 Deviation 1.0 0.5 8 16 32 0.0 0.5 1.0 Figure: Matrix {i, j} of deviations from factorization (π ij /(π i π j ) 1) for an equal probability dyadic partition design by number of strata (0 to 32). Empty cells correspond to 0 deviation (factorization). 16/19

Simulation Example: Sorted Partitions (Cont) 4 100 200 400 800 1600 2 y 0 2 0 4 8 12 2.5 0.0 2.5 5.0 Curve logbias logmse 7.5 10.0 2 1 0 1 2 2 1 0 1 2 2 1 0 1 2 2 1 0 1 2 2 1 0 1 2 x Figure: The marginal estimate of µ = f (x 1 ) Compares the (true) population curve to binary partition (high and low) and the stratified partition sample. Top to bottom: estimated curve, log of absolute bias, log of mean square error. Left to right: population size (100 to 1600). 17/19

Conclusions and Future Research Motivation: lack of theory to justify consistent estimation for complex, multistage cluster designs such as the NSDUH Approximate or asymptotic factorization of joint sampling probabilities exclude these designs Results: New theoretical conditions now allow for realistic dependence between sampling units. Dependence between units within a cluster is unrestricted if cluster size is bounded and dependence between clusters attenuates. This dependence can be positive (joint selection) or negative (mutual exclusion). Units sorted along a gradient can now be fully justified if the sampling occurs independently across strata or within clusters. Current Research: Impact of survey design on posterior interval estimation. Correcting for mispecification of sampling design efficiency (effective sample size) 18/19

References Ghosal, S. and van der Vaart, A. (2007), Convergence rates of posterior distributions for noniid observations, Ann. Statist. 35(1), 192 223. URL: http://dx.doi.org/10.1214/009053606000001172 Morton, K. B., Aldworth, J., Hirsch, E. L., Martin, P. C. and Shook-Sa, B. E. (2016), Section 2, sample design report, in 2014 National Survey on Drug Use and Health: Methodological Resource Book, Center for Behavioral Health Statistics and Quality, Substance Abuse and Mental Health Services Administration, Rockville, MD. Novelo, L. L. and Savitsky, T. (2017), Fully Bayesian Estimation Under Informative Sampling, ArXiv e-prints. URL: https://arxiv.org/abs/1710.00019 Savitsky, T. D. and Toth, D. (2016), Bayesian Estimation Under Informative Sampling, Electronic Journal of Statistics 10(1), 1677 1708. Williams, M. R. and Savitsky, T. D. (2017), Bayesian Pairwise Estimation Under Dependent Informative Sampling, ArXiv e-prints. URL: https://arxiv.org/abs/1710.10102 Yi, G. Y., Rao, J. N. K. and Li, H. (2016), A Weighted Composite Likelihood Approach for Analysis of Survey Data under Two-level Models, Statistica Sinica 26, 569 587. 19/19