Approximate Bayesian computation for spatial extremes via open-faced sandwich adjustment


Approximate Bayesian computation for spatial extremes via open-faced sandwich adjustment. Ben Shaby (SAMSI). August 3, 2010.

Outline: (1) Introduction, (2) Spatial Extremes, (3) The OFS adjustment, (4) Simulation.

Posterior distributions. I will describe how to draw from a "posterior" distribution (note the finger quotes). What do we want out of a posterior distribution? For our purposes, we will want a distribution that (1) describes our state of knowledge (uncertainty) about a parameter, and (2) produces equi-tailed credible intervals that have nominal frequentist coverage rates. This is not a very Bayesian view!

A good posterior. [Figure: an ideal "posterior" — the empirical density of θ̂ overlaid with π(θ | x), closely matching.]

Extreme values. Of the environmental variables we study, what we usually really care about are the extremes: heat waves, storms, sea levels. Why do we care? To manage risk (insurance, etc.) and for emergency preparedness.

Floods. [Photo slide.]

Heat waves. [Photo slide.]

Extreme values. "Extreme values" can mean many things; we consider only block maxima (or block minima). Block maxima asymptotically follow the generalized extreme value (GEV) distribution

G(x) = exp{ -[1 + ξ(x - η)/τ]_+^(-1/ξ) },

where z_+ = max(z, 0). Here η is a location parameter and τ a scale parameter; ξ is a shape parameter, and determines the tail behavior. Then any maximal process should have GEV marginal distributions!
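As a quick aside (a minimal sketch, not from the talk), the GEV distribution function above is easy to evaluate directly; the parameter names follow the slide's notation:

```python
import numpy as np

def gev_cdf(x, eta, tau, xi):
    """GEV cdf G(x) = exp(-[1 + xi*(x - eta)/tau]_+^(-1/xi)), for xi != 0.
    eta: location, tau: scale, xi: shape (controls tail behavior)."""
    z = np.maximum(1.0 + xi * (x - eta) / tau, 0.0)  # the [.]_+ truncation
    return np.exp(-np.power(z, -1.0 / xi))

# At x = eta the bracket equals 1, so G(eta) = exp(-1) for any shape xi.
print(gev_cdf(0.0, 0.0, 1.0, 0.2))  # ~0.3679
```

Note that ξ > 0 gives the heavy-tailed (Fréchet) case and ξ < 0 the bounded (Weibull) case; ξ → 0 recovers the Gumbel limit.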

Spatial extremes. It is possible to construct processes with spatial structure and GEV marginals. This leads us to max-stable processes; the Smith model is one example. [Figure: a one-dimensional Smith process realization, Z(x) versus x.]

Smith process in 2 dimensions. [Image slide.]

GP margins. Lest you fear that this process is unrealistic, the margins don't have to be the same everywhere: the unit Fréchet margins can be transformed to site-specific GEV margins, with the GEV parameters varying smoothly in space (e.g., via a Gaussian process).

Pairwise likelihoods. Unfortunately, joint likelihoods for the Smith process are not known for n > 2. But we can write the pairwise likelihood, a form of composite likelihood:

L_p(θ; y) = ∏_{i<j} f(y_i, y_j; θ).

It turns out that L_p(θ; y) behaves similarly to the likelihood. Can we trick MCMC into doing something useful with L_p(θ; y)?
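Schematically, the pairwise log-likelihood just sums a bivariate log-density over all pairs of sites. A minimal sketch, where log_f2 is a stand-in for the model's bivariate log-density (an assumption here, not the actual Smith-model expression):

```python
import itertools
import numpy as np

def pairwise_loglik(theta, y, log_f2):
    """Pairwise log-likelihood: sum log f(y_i, y_j; theta) over all site
    pairs i < j, and over independent blocks (rows of y).
    y: array of shape (n_blocks, n_sites).
    log_f2: user-supplied bivariate log-density (a placeholder)."""
    total = 0.0
    for i, j in itertools.combinations(range(y.shape[1]), 2):
        total += np.sum(log_f2(y[:, i], y[:, j], theta))
    return total
```

Maximizing (or MCMC-sampling) this objective in place of the unavailable full log-likelihood is what makes composite-likelihood inference tractable.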

The quasi-posterior. Yes! We define the quasi-posterior distribution as

π_{p,n}(θ | y_n) = L_{p,n}(θ; y_n) π(θ) / ∫_Θ L_{p,n}(θ; y_n) π(θ) dθ.

We will assume, for convenience, that π(θ) is proper. L_{p,n} is not necessarily a density, so π_{p,n}(θ | y_n) is not a true posterior. But L_{p,n} is integrable, so as long as the prior π(θ) is proper, π_{p,n}(θ | y_n) will be a proper density.
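Sampling from the quasi-posterior works with any standard MCMC scheme, simply substituting the pairwise log-likelihood for the true one. A generic random-walk Metropolis sketch (the callables, tuning, and function names are illustrative assumptions, not the talk's actual sampler):

```python
import numpy as np

def quasi_posterior_mcmc(log_Lp, log_prior, theta0, n_iter=5000, step=0.5, seed=0):
    """Random-walk Metropolis targeting pi_p(theta|y) prop. to L_p(theta;y)*pi(theta).
    log_Lp: log pairwise likelihood; log_prior: log prior density."""
    rng = np.random.default_rng(seed)
    theta = np.atleast_1d(np.asarray(theta0, dtype=float))
    logpost = log_Lp(theta) + log_prior(theta)
    chain = np.empty((n_iter, theta.size))
    for t in range(n_iter):
        prop = theta + step * rng.standard_normal(theta.size)
        lp = log_Lp(prop) + log_prior(prop)
        if np.log(rng.uniform()) < lp - logpost:  # Metropolis accept/reject
            theta, logpost = prop, lp
        chain[t] = theta
    return chain
```

The resulting chain has the quasi-posterior as its stationary distribution; the point of the next sections is that its spread is the wrong one for frequentist interval estimation.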

More definitions. Now we can write down a quasi-Bayes estimator. Define loss in the usual way, and define the quasi-posterior risk R_n(θ) as the quasi-posterior expectation of loss. The pairwise quasi-Bayes estimator is then θ̂_QB = argmin_{θ ∈ Θ} R_n(θ).


The sandwich matrix. Define

P_n = E_0[∇_0 ℓ_{p,n} ∇_0 ℓ_{p,n}'],
B_n = E_0[-∇²_0 ℓ_{p,n}],
S_n = B_n P_n^{-1} B_n  —  bread (B_n), peanut butter (P_n^{-1}), bread (B_n).

Asymptotic normality of quasi-Bayes estimators. Then, as long as we don't use a crazy prior, Chernozhukov and Hong (2003) tell us:

Theorem. S_n^{1/2}(θ̂_QB - θ_0) →_D N(0, I).

When we use pairwise likelihoods for MCMC, the sandwich matrix describes the (asymptotic) sampling variability of the estimator.

Convergence of the quasi-posterior. Furthermore (also from Chernozhukov and Hong, 2003):

Theorem. Asymptotically, π_{p,n}(θ | y_n) → N(θ_0, B_n^{-1}).

This has important consequences for inference from the MCMC sample: B_n^{-1} ≠ B_n^{-1} P_n B_n^{-1} = S_n^{-1}! Equi-tailed credible intervals based on MCMC quantiles will NOT have the correct frequentist coverage probabilities.

Distortion of the posterior. The two curves are very different! [Figure: the empirical density of θ̂ versus the much narrower π_p(θ | x).]

The OFS adjustment. The main idea: whereas θ̂_QB is distributed like a sandwich normal (variance S_n^{-1}), the quasi-posterior looks like a single-slice-of-bread normal (variance B_n^{-1}). We want to complete the sandwich by joining the slice of bread B_n^{-1} to the open-faced sandwich P_n B_n^{-1} to get S_n^{-1} = B_n^{-1} P_n B_n^{-1}.

The OFS adjustment. The trick: let Ω = B^{-1} P^{1/2} B^{1/2}, the open-faced sandwich (OFS) adjustment matrix. Take samples from π_p(θ | y) obtained via MCMC, center them, and pre-multiply by an estimator Ω̂ of Ω. If everything goes according to plan, and if you squint a bit, each centered sample Z ~ N(0, B^{-1}), making the transformed sample Z* = ΩZ ~ N(0, S^{-1}). So we should end up with a sample that has the right frequentist properties.
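In code, the adjustment is a single linear map applied to the centered chain. A minimal sketch, assuming estimates B̂ and P̂ are already in hand (the matrix square root is the symmetric one, computed here via eigendecomposition):

```python
import numpy as np

def sqrtm_sym(M):
    """Symmetric square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(M)
    return (V * np.sqrt(w)) @ V.T

def ofs_adjust(samples, B_hat, P_hat):
    """Open-faced sandwich adjustment: premultiply centered MCMC samples
    by Omega = B^{-1} P^{1/2} B^{1/2}, turning approximately N(0, B^{-1})
    draws into approximately N(0, B^{-1} P B^{-1}) = N(0, S^{-1}) draws."""
    omega = np.linalg.inv(B_hat) @ sqrtm_sym(P_hat) @ sqrtm_sym(B_hat)
    center = samples.mean(axis=0)
    return (samples - center) @ omega.T + center
```

The design rationale: Ω B^{-1} Ω' = B^{-1} P B^{-1}, so the adjusted sample has (approximately) the sandwich covariance, and its equi-tailed quantiles give intervals with the intended frequentist coverage.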

Simulated data. I simulated 1000 datasets, each with y ~ Smith process(Σ) with unit Fréchet margins, Σ = [0.75, 0.5; 0.5, 1.25], 100 spatial locations, and 100 blocks.

MCMC with OFS. For each realization of y, we run MCMC using the pairwise likelihood. The OFS matrix is constructed via the four combinations of: (1) P̂, a Monte Carlo estimate of the expected information at θ_0; (2) P̂, a moment estimate of the expected information at θ̂; (3) B̂, the sample covariance of the MCMC sample; (4) B̂, the observed information at θ̂. Intervals are constructed as equi-tailed quantiles of the adjusted MCMC sample, and coverage rates are computed.
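The coverage tally itself is straightforward; a sketch of how such rates might be computed across simulated datasets (all names here are illustrative):

```python
import numpy as np

def equi_tailed_coverage(chains, theta_true, level=0.95):
    """Fraction of simulated datasets whose equi-tailed credible interval,
    formed from the (adjusted) MCMC chain's quantiles, covers the truth."""
    alpha = 1.0 - level
    hits = 0
    for chain in chains:
        lo, hi = np.quantile(chain, [alpha / 2.0, 1.0 - alpha / 2.0])
        hits += int(lo <= theta_true <= hi)
    return hits / len(chains)
```

Plotting this empirical rate against a grid of nominal levels produces exactly the kind of coverage curves shown on the next slide.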

Coverage rates. [Figure: empirical versus nominal coverage, in three panels for Σ_11, Σ_12, and Σ_22.] Dashed lines are OFS-adjusted samples; the solid line is un-adjusted.

Summary. In summary: max-stable processes are useful for modeling spatial extremes, but their corresponding joint densities are unavailable. One can construct a quasi-posterior using pairwise likelihoods, but the quasi-posterior does not reflect parameter uncertainty. Using OFS, we can adjust MCMC samples from the quasi-posterior to have the properties we want. A few caveats: the OFS matrix can be difficult to estimate (in particular, the peanut-butter center); this approach would really shine in hierarchical models, which I have not shown you; it's not really Bayesian; and Ribatet et al. (2010) have a different approach to the same problem.

References.
Victor Chernozhukov and Han Hong. An MCMC approach to classical estimation. Journal of Econometrics, 115(2):293-346, 2003.
Mathieu Ribatet, Daniel Cooley, and Anthony Davison. Bayesian inference from composite likelihoods, with an application to spatial extremes. Extremes, 2010, to appear.