The Temperature Proxy Controversy


School of Mathematics February 8, 2012


A provocative statement: "Figures often beguile me, particularly when I have the arranging of them myself; in which case the remark attributed to Disraeli would often apply with justice and force: 'There are three kinds of lies: lies, damned lies, and statistics.'" (Chapters from my Autobiography, Mark Twain, 1906)

Another provocative statement: "... it is likely that the 1990s have been the warmest decade and 1998 the warmest year of the millennium." (Climate Change 2001: The Scientific Basis, Intergovernmental Panel on Climate Change, IPCC 2001.) What is the scientific basis for this statement? How do we know there wasn't a similar temperature spike in the 1190s or the 1530s?

The famous Hockey Stick

Some quick remarks: The blue curve is a weighted average of temperature histories, conditioned on the data; the gray region indicates only pointwise confidence intervals; known temperatures from the past 150 years ("the instrumental period") were used as training data for the statistical model. The first two features tend to downplay the likelihood that temperature spikes might have occurred prior to 1850; the third makes the model seem better than it is at finding such spikes.


The general method: Data from tree rings, ice cores, coral, lake sediments, etc., are proxies for the actual local temperatures. The data are gathered from many places and many time periods. The number of different proxies is p ≈ 1200, but only about 100 are available for the entire millennium. Scientific theories lead to statistical models for the relationship between proxy measurements and temperatures (fancy linear regression). Model parameters are tuned using known temperatures from the past n = 150 years.
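
To make the pipeline concrete, here is a minimal sketch of the calibration step in Python, using placeholder data; the array shapes and the names proxies and temp are assumptions for illustration, not the authors' code:

    import numpy as np
    from sklearn.linear_model import LinearRegression

    rng = np.random.default_rng(0)
    T, p, n = 1000, 100, 150              # years ~1000-2000, ~100 full-length proxies
    proxies = rng.standard_normal((T, p)) # stand-in for the real proxy matrix
    temp = rng.standard_normal(n)         # stand-in for instrumental temperatures

    model = LinearRegression().fit(proxies[-n:], temp)  # tune on the instrumental period
    backcast = model.predict(proxies[:-n])              # backcast temperatures for 1000-1850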

It's an art: The tuned model backcasts temperatures to the period 1000-1850. This is not a standard statistical problem, and there are many competing approaches, most of them attempting to deal with the fact that p ≫ n. Two approaches are: (i) select a small number of "principal components"; (ii) the Lasso, a regression method that penalizes the inclusion of more covariates.
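
A sketch of the two approaches, continuing the example above; the number of components and the cross-validation settings are arbitrary choices:

    from sklearn.decomposition import PCA
    from sklearn.linear_model import LassoCV, LinearRegression
    from sklearn.pipeline import make_pipeline

    # (i) regress temperature on a few principal components of the proxies
    pcr = make_pipeline(PCA(n_components=10), LinearRegression())
    pcr.fit(proxies[-n:], temp)

    # (ii) the Lasso: the L1 penalty drives most coefficients to exactly
    # zero, so it effectively selects a small subset of the proxies
    lasso = LassoCV(cv=5).fit(proxies[-n:], temp)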

Selection, validation, comparison: A common way to validate a method or model, or to compare two of them, is block holdout RMSE: blocks of the training data are withheld during the tuning stage, the tuned model is then used to predict the held-out temperatures, and the RMSE of those predictions is calculated. In selecting and validating the model that produced the Hockey Stick, two blocks were held out, the first 50 years and the last 50 years, so the tuning was based on the middle 50 years.
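
In code, a single-block version of this check might look like the following, continuing the sketch above (the block position is arbitrary):

    import numpy as np
    from sklearn.linear_model import LinearRegression

    X, y = proxies[-n:], temp
    hold = np.arange(50)                      # e.g. hold out the first 50 years
    train = np.setdiff1d(np.arange(n), hold)  # tune on the remaining years

    m = LinearRegression().fit(X[train], y[train])
    rmse = np.sqrt(np.mean((m.predict(X[hold]) - y[hold]) ** 2))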

The three blocks: 1850-1900, 1900-1950, 1950-2000. Holding out the two end blocks was done to favor models that are good at extrapolation. But let's think about this for a minute...


Is there a Hockey Stick Bias? In Mann et al. [1998], a slightly non-standard method was used involving a single principal component, and it was later shown that this method has a Hockey Stick Bias. In a subsequent paper, Mann et al. [2004], the block holdout method was used to decide on the number of principal components, and the Hockey Stick (magically) reappeared. Mainstream statisticians were typically not involved (beyond making a few criticisms).
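
The "non-standard" step is usually identified with short-centering: the proxy series were centered on the recent calibration window rather than on their full length, which tends to promote hockey-stick-shaped series into the leading principal component. A toy illustration of that mechanism (my reconstruction, not code from any of the papers):

    import numpy as np

    rng = np.random.default_rng(1)
    X = rng.standard_normal((1000, 50))   # 50 pure-noise "proxies"
    X[-150:, 0] += 2.0                    # give one series a recent uptick

    # Centering on the recent window only leaves a large offset in the
    # early part of the uptick series, inflating its variance, so it
    # dominates the first principal component.
    Xc = X - X[-150:].mean(axis=0)        # "short" centering
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    print(np.abs(vt[0]).argmax())         # -> 0: the uptick series loads on PC1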

Enter: McShane and Wyner [2010]. The most famous line in the paper is probably "We find that the proxies do not predict temperature significantly better than random series generated independently of temperature." Doubt is cast on the ability of the proxies to forecast high levels of, and a sharp run-up in, temperature... if in fact they occurred several hundred years ago. Other, equally valid models produce extremely different historical backcasts.

McShane and Wyner [2010] (continued): They create their own Bayesian model. It produces a reconstruction of the past that is similar to the Hockey Stick, but with a much wider error band, which reflects "the weak signal and large uncertainty... in this setting." The Bayesian model produces a 36% posterior probability that 1998 was the warmest year of the past thousand. An entire issue of the Annals of Applied Statistics was devoted to this paper, with 13 responses and a rejoinder.


Assumptions ("Best Case Scenario"): The data set used by Mann et al. is reliable. The basic modeling assumptions used by Mann et al. and other climate scientists are acceptable; these are that the relationships between the proxies and the temperatures are approximately linear and stationary. Block holdout RMSE can be used to evaluate models, though not necessarily using just the first and last thirds of 1850-2000.

The Mainstream Statisticians' Touch: They introduce several interesting null models to provide benchmarks for the predictive strength of the proxies. To evaluate models, McShane and Wyner calculate the holdout RMSE for all 120 possible 30-year blocks of data from 1850-2000. For regression they use the Lasso, considered to be a reasonable procedure that has "proven powerful, fast, and popular" and performs comparably well in a p ≫ n context.
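
A sketch of the sliding-block evaluation, continuing the example above; the Lasso penalty and the exact block count are assumptions:

    import numpy as np
    from sklearn.linear_model import Lasso

    block, rmses = 30, []
    for start in range(n - block + 1):        # every contiguous 30-year block
        hold = np.arange(start, start + block)
        train = np.setdiff1d(np.arange(n), hold)
        m = Lasso(alpha=0.1).fit(X[train], y[train])
        err = m.predict(X[hold]) - y[hold]
        rmses.append(np.sqrt(np.mean(err ** 2)))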

The null models: The null models in this paper are of two types. First, those based strictly on the temperatures from 1850-2000, the simplest being to predict the mean temperature of that period. Second, those that replace the proxy data with pseudo-proxies: randomly generated time series, independent of the historical record. One example replaces each proxy time series with a Brownian motion.
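
Both null types are easy to sketch, continuing the example above (block positions and seeds are arbitrary):

    import numpy as np

    # 1) Temperature-only null: for any holdout block, predict the mean
    #    of the remaining training temperatures.
    rng = np.random.default_rng(2)
    hold = np.arange(30)
    train = np.setdiff1d(np.arange(n), hold)
    mean_rmse = np.sqrt(np.mean((temp[hold] - temp[train].mean()) ** 2))

    # 2) Pseudo-proxy null: independent Brownian-motion paths that stand
    #    in for the real proxy matrix in the fitting code above.
    pseudo_bm = np.cumsum(rng.standard_normal((T, p)), axis=0)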

The most interesting null model: One pseudo-proxy scheme is called the Empirical AR1 Model. An AR1 time series has the form $X_t = \phi X_{t-1} + \epsilon_t$, where $0 \le \phi \le 1$ is a parameter and the $\epsilon_t$ are independent standard Gaussian random variables. For each proxy series $i = 1, \ldots, 1200$, a parameter $\phi_i$ is estimated; the 1200 pseudo-proxy series are then generated independently, using these estimated values. Also considered are the cases $\phi_i \equiv 1$ (Brownian motion), $\phi_i \equiv 0$ (white noise), and $\phi_i \equiv 0.25$ and $0.4$.
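
A sketch of the Empirical AR1 construction, continuing the example above; estimating $\phi_i$ by the lag-1 autocorrelation is one standard choice, assumed here:

    import numpy as np

    def fit_phi(x):
        # lag-1 autocorrelation of a demeaned series
        x = x - x.mean()
        return (x[1:] @ x[:-1]) / (x[:-1] @ x[:-1])

    def simulate_ar1(phi, length, rng):
        # X_t = phi * X_{t-1} + eps_t with standard Gaussian eps_t
        eps = rng.standard_normal(length)
        x = np.empty(length)
        x[0] = eps[0]
        for t in range(1, length):
            x[t] = phi * x[t - 1] + eps[t]
        return x

    rng = np.random.default_rng(3)
    phis = [fit_phi(proxies[:, i]) for i in range(p)]
    pseudo_ar1 = np.column_stack([simulate_ar1(ph, T, rng) for ph in phis])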

The results (figures)

A different way of using pseudo-proxies: An alternative method for assessing the predictive value of the proxies is to create a data set consisting of the 1200 proxy time series together with 1200 independently generated pseudo-proxy time series, and then perform a Lasso regression on the expanded data set. Since the Lasso generally selects a relatively small subset of the covariates, based on their apparent predictive value during the training phase, it is interesting to see whether it selects true proxies more often than pseudo-proxies.
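
A sketch of that comparison, continuing the example above (the cross-validated penalty choice is an assumption):

    import numpy as np
    from sklearn.linear_model import LassoCV

    # Stack real and pseudo proxies side by side, fit the Lasso on the
    # instrumental period, and count selections in each half
    # (columns 0..p-1 are the real proxies).
    X_aug = np.hstack([proxies, pseudo_ar1])[-n:]
    m = LassoCV(cv=5).fit(X_aug, temp)
    selected = np.flatnonzero(m.coef_)
    n_real = int(np.sum(selected < p))
    n_pseudo = int(np.sum(selected >= p))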

The results (figure)

Comparing several proxy models: McShane and Wyner also looked at the backcasts produced by several different regression models, based only on the roughly 100 proxies for which data are available over the entire millennium. There were 27 models in all, including the Lasso and stepwise regression (applied to the full proxy series and to principal components of the proxies), ordinary regression applied to principal components, and two-stage models involving local temperatures.

A variety of backcasts: Three backcasts, using models with very similar (relatively good) RMSEs: PC1 (red), PC10 (green), two-stage (blue).

The Bayesian Model