Monte Carlo Techniques for Regressing Random Variables. Dan Pemstein. Using V-Dem the Right Way

Workshop Goals

- (Re)familiarize you with Monte Carlo methods for estimating functions of random variables and integrating/marginalizing
- (Re)familiarize you with how to work with the output of Markov chain Monte Carlo (MCMC) simulations
- Introduce the V-Dem measurement model
- Explain how to incorporate measurement uncertainty in V-Dem variables into statistical analyses (regressing random variables)

The Monte Carlo Principle

If we can sample many times from the density, f(θ), of a random variable, θ, we can learn anything we want to know about any computable function of that variable:

E(θ) = ∫ θ f(θ) dθ

What if this integral is tricky to compute, but we can sample from f(θ)? Sample θ^(t) for t = 1, 2, ..., T draws from f(θ). Then

(1/T) Σ_{t=1}^T θ^(t) → ∫ θ f(θ) dθ as T → ∞
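The principle can be sketched in a few lines. This is an illustrative Python example (the slides' own code is in R); the Gamma(2, 1) density and the tail probability are arbitrary choices, not part of the workshop material:

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo estimate of E[theta] for theta ~ Gamma(2, 1):
# draw T samples and average; the sample mean converges to the
# true expectation (here 2.0) as T grows.
T = 100_000
draws = rng.gamma(shape=2.0, scale=1.0, size=T)
mc_mean = draws.mean()

# The same draws estimate any computable function of theta,
# e.g. P(theta > 3) as the fraction of draws exceeding 3.
mc_tail = (draws > 3.0).mean()
```

The point of the second estimate is that no new sampling is needed: one set of draws answers every question about f(θ).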

Example: Sums of Random Normal Variables

x1 ~ N(µ1, σ1)
x2 ~ N(µ2, σ2)

What's the mean of y = x1 + x2? What's the SD?

y ~ N(µ1 + µ2, √(σ1² + σ2²))

Simulate:

> T <- 10000
> MEAN <- c(-3, 3)
> SD <- c(2, 4)
> x1 <- rnorm(T, MEAN[1], SD[1])
> x2 <- rnorm(T, MEAN[2], SD[2])
> mean(x1 + x2)
[1] 0.01308404
> sd(x1 + x2)
[1] 4.476329

The Method of Composition

Say you have a vector of n random variables θ = (θ1, θ2, ..., θn) and a data vector y. The joint posterior density of the random variables is f(θ|y). You're interested in the marginal posterior density

f(θ1|y) = ∫ f(θ|y) dθ₋₁ = ∫ f(θ1|θ₋₁, y) f(θ₋₁|y) dθ₋₁

where θ₋₁ denotes every element of θ except θ1. If you can sample from f(θ₋₁|y) and f(θ1|θ₋₁, y), then you can simulate from the marginal density f(θ1|y):

for each t ∈ 1, 2, ..., T do
  1. sample θ₋₁^(t) from f(θ₋₁|y)
  2. sample θ1^(t) from f(θ1|θ₋₁^(t), y)

Then θ1^(t) ~ f(θ1|y).
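The two-step recipe can be checked on a toy model where the marginal is known in closed form. A Python sketch (illustrative distributions, not from the slides): with θ₋₁ ~ N(0, 1) and θ1 | θ₋₁ ~ N(θ₋₁, 1), the marginal of θ1 is N(0, √2):

```python
import numpy as np

rng = np.random.default_rng(1)

# Method of composition on a toy model:
#   theta_rest ~ N(0, 1)                  (nuisance block)
#   theta_1 | theta_rest ~ N(theta_rest, 1)
# Drawing theta_rest first, then theta_1 given it, yields draws
# from the marginal of theta_1, which here is N(0, sqrt(2)).
T = 200_000
theta_rest = rng.normal(0.0, 1.0, size=T)   # step 1
theta_1 = rng.normal(theta_rest, 1.0)       # step 2

# theta_1 now approximates the marginal: mean 0, sd sqrt(2) ~ 1.414
```

No explicit integration over θ₋₁ ever happens; the nuisance draws are simply discarded after step 2, which is exactly what marginalization means in Monte Carlo terms.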

The Method of Composition

Anatomy of a V-Dem Indicator

- We want to measure continuous latent traits (a matrix of random variables, Z)
- Coders have varying thresholds and reliabilities (coder parameters, a vector of random variables, ϕ)
- Multiple coders per country-year provide observations on an ordinal scale (the data matrix, R)
- We use MCMC methods to simulate latent traits from the marginal posterior density f(Z|R)
- We obtain a sample Z^(1), Z^(2), ..., Z^(T):

  Afghanistan 1900: z11^(1), z11^(2), ..., z11^(T)
  Afghanistan 1901: z12^(1), z12^(2), ..., z12^(T)

V-Dem C Indicators are Random Variables

- Known only up to a density function
- Working with point estimates throws out information
  - This is true for both right- and left-hand-side variables
- Hard to predict how measurement uncertainty will affect inferences
  - Cross-correlations in draws
  - Correlations with other variables may be robust across the density
- The standard errors-in-variables (EIV) model addresses a related, but different issue
  - Data points in the EIV model are country-years in R
  - Our measurement model addresses the EIV problem, while relaxing EIV assumptions about bias (a little bit)
- V-Dem point estimates are best estimates of latent values, but one shouldn't throw out our uncertainty around those estimates

Raw Data (R)

Threshold Bias: Can the head of government propose legislation in practice?

Modeling Framework: Core Assumptions

- Multi-rater ordinal probit / ordinal item response theory model
- Rater perceptions equal reality + error
- Raters exhibit arbitrary thresholds
- Given thresholds, raters are correct on average (unbiased)
- Both thresholds and reliability vary across raters
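These assumptions can be made concrete with a small simulation. This is a toy Python sketch of the data-generating story only (all parameters, counts, and the 4-category scale are illustrative choices, not V-Dem's actual model or estimates):

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy rating process under the slide's assumptions: each rater
# perceives the latent trait z plus unbiased noise, then reports
# the ordinal category implied by that rater's own thresholds.
# Less reliable raters have larger perception noise.
n_cases, n_raters = 1000, 5
z = rng.normal(size=n_cases)                      # latent traits
noise_sd = rng.uniform(0.3, 1.0, size=n_raters)   # rater reliabilities
# Arbitrary per-rater cutpoints: thresholds differ across raters.
thresholds = np.sort(rng.normal(0.0, 1.0, size=(n_raters, 3)), axis=1)

ratings = np.empty((n_cases, n_raters), dtype=int)
for j in range(n_raters):
    perception = z + rng.normal(0.0, noise_sd[j], size=n_cases)  # unbiased
    # Category = number of thresholds the perception exceeds (0..3).
    ratings[:, j] = (perception[:, None] > thresholds[j]).sum(axis=1)
```

The measurement model runs this logic in reverse: given only the ordinal `ratings` matrix, it infers z along with each rater's thresholds and reliability.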

Fitting Regressions with V-Dem Posterior Samples

- Goal: for an arbitrary regression model, estimate the marginal posterior density of the coefficient vector β, using V-Dem data, taking measurement uncertainty into account
- Sample from the joint posterior density f(β, Z|Y, R), using the method of composition
  - β is a vector of model coefficients
  - Y is a matrix of data measured without uncertainty
- We can take advantage of the decomposition f(β, Z|Y, R) = f(β|Z, Y, R) f(Z|R, Y)
- Assume: f(β|Z, Y, R) = f(β|Z, Y) and f(Z|R, Y) = f(Z|R)
- We can now rewrite the decomposition as f(β, Z|Y, R) = f(β|Z, Y) f(Z|R)

Applying the Method of Composition

f(β, Z|Y, R) = f(β|Z, Y) f(Z|R)

- We want the marginal distribution of β, f(β|Y)
- Remember: f(θ1|y) = ∫ f(θ1|θ₋₁, y) f(θ₋₁|y) dθ₋₁
- So, given our assumptions: f(β|Y) = ∫ f(β|Z, Y) f(Z|R) dZ
- The V-Dem modeling team already simulated T = 900 draws where Z^(t) ~ f(Z|R) [first MoC step]
- Regression coefficients are (asymptotically) distributed N(µ, Σ)
- To apply the method of composition, for each t ∈ 1, 2, ..., T:
  1. Fit your arbitrary regression model to data Y and Z^(t), yielding maximum likelihood estimates µ̂^(t) and Σ̂^(t)
  2. Sample β^(t) ~ N(µ̂^(t), Σ̂^(t))
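The two steps translate directly into a loop over latent-variable draws. This Python sketch fabricates both the data and the posterior draws of the latent covariate purely for illustration (the real Z^(t) come from the V-Dem measurement model, and the slides' own code is in R):

```python
import numpy as np

rng = np.random.default_rng(2)

# Method-of-composition regression with simulated posterior draws
# of a latent regressor. All data here are synthetic stand-ins.
n, T = 500, 900
z_true = rng.normal(size=n)                 # latent covariate
y = 1.0 + 2.0 * z_true + rng.normal(0.0, 1.0, size=n)
# Pretend measurement model: T posterior draws of z around z_true.
Z_draws = z_true[None, :] + rng.normal(0.0, 0.3, size=(T, n))

betas = np.empty((T, 2))
for t in range(T):
    X = np.column_stack([np.ones(n), Z_draws[t]])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y            # step 1: fit OLS to draw t
    resid = y - X @ beta_hat
    sigma2 = resid @ resid / (n - 2)
    cov = sigma2 * XtX_inv
    betas[t] = rng.multivariate_normal(beta_hat, cov)  # step 2

# Pooled draws reflect both estimation and measurement uncertainty.
post_mean = betas.mean(axis=0)
post_sd = betas.std(axis=0)
```

Note that `post_sd` is wider than any single fit's standard error, because variation in β̂^(t) across the Z^(t) draws is folded in on top of the within-fit sampling variance.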

V-Dem Example

infant mortality_cy ~ free discussion women_cy + ln(GDPpc)_cy

Afghanistan 1900: z11, z11^(1), z11^(2), ..., z11^(T)
Afghanistan 1901: z12, z12^(1), z12^(2), ..., z12^(T)

1. Regression on the point estimate for z (discussion women)

V-Dem Example

infant mortality_cy ~ free discussion women_cy + ln(GDPpc)_cy

Afghanistan 1900: z11, z11^(1), z11^(2), ..., z11^(T)
Afghanistan 1901: z12, z12^(1), z12^(2), ..., z12^(T)

2. Fit the model using a draw from the marginal density of z

V-Dem Example

infant mortality_cy ~ free discussion women_cy + ln(GDPpc)_cy

Afghanistan 1900: z11, z11^(1), z11^(2), ..., z11^(T)
Afghanistan 1901: z12, z12^(1), z12^(2), ..., z12^(T)

3. Sample β^(t) ~ N(µ̂^(t), Σ̂^(t))
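After looping through all T draws, the pooled β^(t) sample is summarized like any posterior sample. A minimal Python sketch with fabricated draws standing in for real output (the -0.8 center and 0.15 spread are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy pooled draws for one coefficient, standing in for the
# beta^(t) collected across the T method-of-composition fits.
beta_draws = rng.normal(-0.8, 0.15, size=900)

# Point estimate and 95% interval come straight from the
# empirical distribution of the draws.
point = beta_draws.mean()
ci_lo, ci_hi = np.quantile(beta_draws, [0.025, 0.975])
```

Reporting the 2.5% and 97.5% quantiles of the draws, rather than a single fit's confidence interval, is what carries the V-Dem measurement uncertainty through to the final inference.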
