STATISTICS OF CLIMATE EXTREMES// TRENDS IN CLIMATE DATASETS

Similar documents
STATISTICS FOR CLIMATE SCIENCE

University of Oxford. Statistical Methods Autocorrelation. Identification and Estimation

CLIMATE EXTREMES AND GLOBAL WARMING: A STATISTICIAN S PERSPECTIVE

Forecasting using R. Rob J Hyndman. 2.4 Non-seasonal ARIMA models. Forecasting using R 1

Ch 6. Model Specification. Time Series Analysis

Circle the single best answer for each multiple choice question. Your choice should be made clearly.

Circle a single answer for each multiple choice question. Your choice should be made clearly.

Time Series 4. Robert Almgren. Oct. 5, 2009

Richard L. Smith Department of Statistics and Operations Research University of North Carolina Chapel Hill, NC

Time Series I Time Domain Methods

Part 1. Multiple Choice (50 questions, 1 point each) Part 2. Problems/Short Answer (10 questions, 5 points each)

STAT 436 / Lecture 16: Key

FE570 Financial Markets and Trading. Stevens Institute of Technology

Classical Decomposition Model Revisited: I

TIME SERIES ANALYSIS AND FORECASTING USING THE STATISTICAL MODEL ARIMA

at least 50 and preferably 100 observations should be available to build a proper model

of the 7 stations. In case the number of daily ozone maxima in a month is less than 15, the corresponding monthly mean was not computed, being treated

Chapter 12: An introduction to Time Series Analysis. Chapter 12: An introduction to Time Series Analysis

2. An Introduction to Moving Average Models and ARMA Models

Lecture 2: Univariate Time Series

4. MA(2) +drift: y t = µ + ɛ t + θ 1 ɛ t 1 + θ 2 ɛ t 2. Mean: where θ(l) = 1 + θ 1 L + θ 2 L 2. Therefore,

Basics: Definitions and Notation. Stationarity. A More Formal Definition

AR(p) + I(d) + MA(q) = ARIMA(p, d, q)

Some general observations.

A STATISTICAL APPROACH TO OPERATIONAL ATTRIBUTION

Overview of Extreme Value Analysis (EVA)

A Data-Driven Model for Software Reliability Prediction

Regression with correlation for the Sales Data

STAT Financial Time Series

Chapter 6: Model Specification for Time Series

Advanced Econometrics

SOME BASICS OF TIME-SERIES ANALYSIS

EASTERN MEDITERRANEAN UNIVERSITY ECON 604, FALL 2007 DEPARTMENT OF ECONOMICS MEHMET BALCILAR ARIMA MODELS: IDENTIFICATION

Uncertainty in Ranking the Hottest Years of U.S. Surface Temperatures

Problem Set 2 Solution Sketches Time Series Analysis Spring 2010

Prof. Dr. Roland Füss Lecture Series in Applied Econometrics Summer Term Introduction to Time Series Analysis

Modeling and forecasting global mean temperature time series

Econ 623 Econometrics II Topic 2: Stationary Time Series

STATISTICAL MODELS FOR QUANTIFYING THE SPATIAL DISTRIBUTION OF SEASONALLY DERIVED OZONE STANDARDS

HIERARCHICAL MODELS IN EXTREME VALUE THEORY

INFLUENCE OF CLIMATE CHANGE ON EXTREME WEATHER EVENTS

Time Series Models and Inference. James L. Powell Department of Economics University of California, Berkeley

5 Autoregressive-Moving-Average Modeling

1 Introduction to Generalized Least Squares

STOR 356: Summary Course Notes Part III

THE ROLE OF OCEAN STATE INDICES IN SEASONAL AND INTER-ANNUAL CLIMATE VARIABILITY OF THAILAND

ECON/FIN 250: Forecasting in Finance and Economics: Section 7: Unit Roots & Dickey-Fuller Tests

Empirical Market Microstructure Analysis (EMMA)

INTRODUCTION TO TIME SERIES ANALYSIS. The Simple Moving Average Model

Ch 5. Models for Nonstationary Time Series. Time Series Analysis

Probability and Statistics Notes

STAT 443 Final Exam Review. 1 Basic Definitions. 2 Statistical Tests. L A TEXer: W. Kong

Global Temperature Is Continuing to Rise: A Primer on Climate Baseline Instability. G. Bothun and S. Ostrander Dept of Physics, University of Oregon

Modelling using ARMA processes

ARIMA Models. Jamie Monogan. January 16, University of Georgia. Jamie Monogan (UGA) ARIMA Models January 16, / 27

Chapter 5: Models for Nonstationary Time Series

Stat 5100 Handout #12.e Notes: ARIMA Models (Unit 7) Key here: after stationary, identify dependence structure (and use for forecasting)

Lecture 2 APPLICATION OF EXREME VALUE THEORY TO CLIMATE CHANGE. Rick Katz

The Identification of ARIMA Models

Short Questions (Do two out of three) 15 points each

Seasonal Climate Watch July to November 2018

Seasonal Climate Watch September 2018 to January 2019

MODELING INFLATION RATES IN NIGERIA: BOX-JENKINS APPROACH. I. U. Moffat and A. E. David Department of Mathematics & Statistics, University of Uyo, Uyo

RISK AND EXTREMES: ASSESSING THE PROBABILITIES OF VERY RARE EVENTS

Forecasting with ARIMA models This version: 14 January 2018

1 Random walks and data

A pragmatic view of rates and clustering

Delayed Response of the Extratropical Northern Atmosphere to ENSO: A Revisit *

MS&E 226: Small Data

Model Selection for Geostatistical Models

Statistics 910, #5 1. Regression Methods

Time Series 2. Robert Almgren. Sept. 21, 2009

ARIMA Models. Jamie Monogan. January 25, University of Georgia. Jamie Monogan (UGA) ARIMA Models January 25, / 38

Forecasting. Simon Shaw 2005/06 Semester II

Regression of Time Series

Time Series Analysis -- An Introduction -- AMS 586

Chapter 8: Model Diagnostics

Economics 536 Lecture 7. Introduction to Specification Testing in Dynamic Econometric Models

Chapter 3: Regression Methods for Trends

F9 F10: Autocorrelation

data lam=36.9 lam=6.69 lam=4.18 lam=2.92 lam=2.21 time max wavelength modulus of max wavelength cycle

ECON/FIN 250: Forecasting in Finance and Economics: Section 8: Forecast Examples: Part 1

If we want to analyze experimental or simulated data we might encounter the following tasks:

Climate Variability and El Niño

TMA4285 December 2015 Time series models, solution.

STAT 520: Forecasting and Time Series. David B. Hitchcock University of South Carolina Department of Statistics

FIN822 project 2 Project 2 contains part I and part II. (Due on November 10, 2008)

TIME SERIES DATA ANALYSIS USING EVIEWS

Forecasting using R. Rob J Hyndman. 3.2 Dynamic regression. Forecasting using R 1

A brief introduction to mixed models

Extreme Rainfall in the Southeast U.S.

MAT3379 (Winter 2016)

Review Session: Econometrics - CLEFIN (20192)

Separation of a Signal of Interest from a Seasonal Effect in Geophysical Data: I. El Niño/La Niña Phenomenon

Bayesian dynamic modeling for large space-time weather datasets using Gaussian predictive processes

AR, MA and ARMA models

Econometric Forecasting

A time series is called strictly stationary if the joint distribution of every collection (Y t

The log transformation produces a time series whose variance can be treated as constant over time.

2013 ATLANTIC HURRICANE SEASON OUTLOOK. June RMS Cat Response

Transcription:

STATISTICS OF CLIMATE EXTREMES// TRENDS IN CLIMATE DATASETS Richard L Smith Departments of STOR and Biostatistics, University of North Carolina at Chapel Hill and Statistical and Applied Mathematical Sciences Institute (SAMSI) SAMSI Graduate Class November 14, 2017 1 1

EXAMINE SST AS AN ALTERNATIVE COVARIATE Define Gulf Coast Region and 80 neighboring precipitation stations (Houston Hobby in black) Calculate monthly mean SST for Gulf Coast Region 2

3

ESTIMATES FOR HOUSTON HOBBY AIRPORT 4

1. Use 7-day annual maximum precipitation (also tried 3, 4, 5, 6, 8 days: similar results) 2. Fit linear trend in GEV location parameter: p 0.02 3. Examine all possible consecutive sequences of monthly SST lags from 0 12 months (91 possible covariates) 4. Best SST model uses means of lags 7, 8, 9, p 0.0007, but this raises an obvious selection bias issue 5. I could not come up with any argument why the SST analysis was better than the linear trend analysis after taking account of the selection bias issue 6. Therefore, try a combined analysis over many stations 5

Here is a table of the 10 best models ordered by the deviance-based P-value, with the SST lags that are included in each. Lags P-value 7,8,9 0.0007 7,8 0.0009 7,8,9,10 0.0015 5,6,7,8,9,10 0.0025 6,7,8,9,10 0.0030 6,7,8,9 0.0030 5,6,7,8,9 0.0031 4,5,6,7,8,9,10 0.0035 8 0.0038 3,4,5,6,7,8,9,10 0.0043 Conclusion: Lags 6 10 all feature in many models Difficult to quantify the effect of selection bias Benjamini-Hochberg test for the False Discovery Rate not valid because the tests are not independent Even so, 10 P-values that are less than 0.005 (out of 91 models tested) seems to be unlikely to occur by chance 6

COMBINED ESTIMATES FOR 80 GULF COAST STATIONS 7

1. Initial analysis not promising: many stations showed no statisticially significant trend 2. Definition of SST variable: Settled on averages of lags 7 12 months 3. Based on 7-day maxima, 16 (out of 80) have a statistically significant linear trend at p = 0.05, 13 have a statistically significant SST trend at p = 0.005 4. Must be wary of selection bias issues, but these are still greater numbers than could be explained just by chance 5. Therefore, proceed to a full spatial analysis 8

Spatial Analysis (Similar to: D.M. Holland, V. De Oliveira, L.H. Cox and R.L. Smith (2000), Estimation of regional trends in sulfur dioxide over the eastern United States. Environmetrics 11, 373-393.) Let Z be full spatial field of the linear trend (Z(s) is true trend at location s). Let Ẑ be estimated spatial field at stations s 1,..., s n. Model: Ẑ Z N (Z, W), Z N (0, V(θ)) where W is assumed known and V(θ) is the covariance matrix of some spatial field I use exponential covariance matrix where θ is range. So Ẑ N (0, W + V(θ)) and Z may be reconstructed from Ẑ by kriging (Estimation of W uses bootstrapping) 9

I estimated this model using 1. Ẑ is vector of linear trend estimates at each station 2. Ẑ is vector of SST trend estimates at each station 3. Also considered adjusted analysis: put in both linear trend and residuals from SSTs regressed on linear trend (two covariates in same model, kriging done separately for each) 10

RESULTS 11

Estimates of Overall Trend Parameter Model Estimate Standard Error t Statistics P-value SST alone 0.67 0.23 2.91 0.004 SST adjusted 0.53 0.18 2.86 0.004 Linear alone 1.01 0.66 1.52 0.13 Linear adjusted 1.03 0.31 3.345 0.0008 12

Image Plot Based on SST Trends latpred 25 26 27 28 29 30 31 1.5 1.0 0.5 0.0 96 94 92 90 88 86 84 82 lonpred 13

Image Plot Based on Adjusted SST Trends latpred 25 26 27 28 29 30 31 1.5 1.0 0.5 0.0 0.5 96 94 92 90 88 86 84 82 lonpred 14

Image Plot Based on Linear Trends latpred 25 26 27 28 29 30 31 2.0 1.5 1.0 0.5 96 94 92 90 88 86 84 82 lonpred 15

Image Plot Based on Adjusted Linear Trends latpred 25 26 27 28 29 30 31 2.0 1.5 1.0 0.5 0.0 96 94 92 90 88 86 84 82 lonpred 16

Noboot option Instead of using the bootstrap to estimate the matrix W, we simply assumed W was diagonal, with diagonal entries corresponding to the squares of the standard errors in the GEV fitting procedure (in other words: estimates are independent from station to station) The spatial model fitting procedure was repeated, and the new images plotted. The results were very little different. 17

Estimates of Overall Trend Parameter Model Estimate Standard Error t Statistics P-value SST alone 0.61 0.21 2.92 0.003 SST adjusted 0.54 0.21 2.58 0.01 Linear alone 0.81 0.80 1.02 0.31 Linear adjusted 1.01 0.47 2.13 0.03 18

Image Plot Based on SST Trends (noboot) latpred 25 26 27 28 29 30 31 1.0 0.5 0.0 96 94 92 90 88 86 84 82 lonpred 19

Image Plot Based on Adjusted SST Trends (noboot) latpred 25 26 27 28 29 30 31 1.5 1.0 0.5 0.0 0.5 96 94 92 90 88 86 84 82 lonpred 20

Image Plot Based on Linear Trends (noboot) latpred 25 26 27 28 29 30 31 2.0 1.5 1.0 0.5 96 94 92 90 88 86 84 82 lonpred 21

Image Plot Based on Adjusted Linear Trends (noboot) latpred 25 26 27 28 29 30 31 2.0 1.5 1.0 0.5 96 94 92 90 88 86 84 82 lonpred 22

To do: 1. Rerun the bootstrap procedure to correct for a possible error in constructing the bootstrap samples (discussed during the talk) 2. Interpolate α, σ, ξ in same way and hence construct estimates of threshold exceedance probabilities at different locations under the spatially interpolated model 3. Remark: Direct interpolation of threshold probabilities from site to site doesn t work; estimates are too variable for the spatial model to fit 23

CONCLUSIONS 1. Adjusted model shows a clear separation in contributions of the linear and SST components 2. SST component seems particularly concentrated on Houston, whether adjusted or not 3. Linear trend shows a less definitive pattern 4. Future work: (a) Draw similar plots for estimated probabilities of a Harveylevel exceedance (b) Compare extreme value probabilities for different dates (e.g. 2017 v. 1950) (c) Integrate with CMIP5 model projections for future extreme probabilities (d) POT alternative analysis? 24

TIME SERIES ANALYSIS FOR CLIMATE DATA I Overview II The post-1998 hiatus in temperature trends III NOAA s record streak IV Trends or nonstationarity? 25

TIME SERIES ANALYSIS FOR CLIMATE DATA I Overview II The post-1998 hiatus in temperature trends III NOAA s record streak IV Trends or nonstationarity? 26

HadCRUT4 gl Global Temperatures www.cru.uea.ac.uk Global Temperature Anomaly 0.4 0.2 0.0 0.2 0.4 Slope 0.74 degrees/century OLS Standard error 0.037 1900 1920 1940 1960 1980 2000 Year 27

What s wrong with that picture? We fitted a linear trend to data which are obviously autocorrelated OLS estimate 0.74 deg C per century, standard error 0.037 So it looks statistically significant, but question how standard error is affected by the autocorrelation First and simplest correction to this: assume an AR(1) time series model for the residual So I calculated the residuals from the linear trend and fitted an AR(1) model, X n = φ 1 X n 1 + ɛ n, estimated ˆφ 1 = 0.62 with standard error 0.07. With this model, the standard error of the OLS linear trend becomes 0.057, still making the trend very highly significant But is this an adequate model? 28

ACF of Residuals from Linear Trend ACF 0.0 0.2 0.4 0.6 0.8 1.0 Sample ACF AR(1) 0 5 10 15 20 Lag 29

Fit AR(p) of various orders p, calculate log likelihood, AIC, and the standard error of the linear trend. Model X n = p i=1 φ ix n i + ɛ n, ɛ n N[0, σ 2 ɛ ] (IID) AR order LogLik AIC Trend SE 0 72.00548 140.0110 0.036 1 99.99997 193.9999 0.057 2 100.13509 192.2702 0.060 3 101.84946 193.6989 0.069 4 105.92796 199.8559 0.082 5 106.12261 198.2452 0.079 6 107.98867 199.9773 0.086 7 108.16547 198.3309 0.089 8 108.16548 196.3310 0.089 9 108.41251 194.8250 0.086 10 108.48379 192.9676 0.087 30

Extend the calculation to ARMA(p,q) for various p and q: model is X n p i=1 φ ix n i = ɛ n + q j=1 θ jɛ n j, ɛ n N[0, σɛ 2 ] (IID) AR order MA order 0 1 2 3 4 5 0 140.0 177.2 188.4 186.4 191.5 192.0 1 194.0 193.0 197.4 195.5 201.7 199.8 2 192.3 193.0 195.4 199.2 200.8 199.1 3 193.7 197.2 200.3 197.9 200.8 200.1 4 199.9 199.6 199.8 197.8 196.8 197.4 5 198.2 198.8 197.8 195.8 194.8 192.8 6 200.0 198.3 196.4 195.7 196.5 199.6 7 198.3 196.3 200.2 199.1 194.6 197.3 8 196.3 195.8 194.4 192.5 192.8 196.4 9 194.8 194.4 197.6 197.9 196.2 194.4 10 193.0 192.5 195.0 191.2 194.9 192.4 SE of trend based on ARMA(1,4) model: 0.087 deg C per century 31

ACF of Residuals from Linear Trend ACF 0.0 0.2 0.4 0.6 0.8 1.0 Sample ACF AR(6) ARMA(1,4) 0 5 10 15 20 Lag 32

Calculating the standard error of the trend Estimate ˆβ = n i=1 w i X i, variance σɛ 2 n n i=1 j=1 w i w j ρ i j where ρ is the autocorrelation function of the fitted ARMA model Alternative formula (Bloomfield and Nychka, 1992) Variance(ˆβ) = 2 1/2 0 w(f)s(f)df where s(f) is the spectral density of the autocovariance function and w(f) = is the transfer function n j=1 w n e 2πijf 2 33

Example based on Barnes and Barnes (Journal of Climate, 2015) They compared the OLS estimator of a linear regression with the epoch estimator computed by taking the difference between the first M and last M values of a series of length N, for some M < N 2. The epoch estimator is then rescaled so that it is in the same units as the classical linear regression estimator. Question: Which performs better under various time series assumptions on the underlying series? The following plot shows the transfer functions for N = 100, M = 33, epoch estimator in red, OLS in blue. Generally the OLS estimator is better, but not if the series has a spectral peak near 0.03. 34

Transfer Function Times 1000 0.0 0.2 0.4 0.6 Ratio of Mean Red and Blue Curves is 1.124972 0.00 0.02 0.04 0.06 0.08 0.10 Frequency 35

What s better than the OLS linear trend estimator? Use generalized least squares (GLS) y n = β 0 + β 1 x n + u n, u n ARMA(p, q) Repeat same process with AIC: ARMA(1,4) again best ˆβ = 0.73, standard error 0.10. 36

Calculations in R ip=4 iq=1 ts1=arima(y2,order=c(ip,0,iq),xreg=1:ny,method= ML ) Coefficients: ar1 ar2 ar3 ar4 ma1 intercept 1:ny 0.0058 0.2764 0.0101 0.3313 0.5884-0.4415 0.0073 s.e. 0.3458 0.2173 0.0919 0.0891 0.3791 0.0681 0.0010 sigma^2 estimated as 0.009061: log likelihood = 106.8, aic = -197.59 acf1=armaacf(ar=ts1$coef[1:ip],ma=ts1$coef[ip+1:iq],lag.max=150) 37

TIME SERIES ANALYSIS FOR CLIMATE DATA I Overview II The post-1998 hiatus in temperature trends III NOAA s record streak IV Trends or nonstationarity? 38

HadCRUT4 gl Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.2 0.0 0.2 0.4 1960 1970 1980 1990 2000 2010 Year 39

HadCRUT4 gl Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.2 0.0 0.2 0.4 1960 1970 1980 1990 2000 2010 Year 40

NOAA Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.0 0.2 0.4 0.6 1960 1970 1980 1990 2000 2010 Year 41

NOAA Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.0 0.2 0.4 0.6 1960 1970 1980 1990 2000 2010 Year 42

GISS (NASA) Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 1960 1970 1980 1990 2000 2010 Year 43

GISS (NASA) Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 1960 1970 1980 1990 2000 2010 Year 44

Berkeley Earth Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 1960 1970 1980 1990 2000 2010 Year 45

Berkeley Earth Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 1960 1970 1980 1990 2000 2010 Year 46

Cowtan Way Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 1960 1970 1980 1990 2000 2010 Year 47

Cowtan Way Temperature Anomalies 1960 2014 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 1960 1970 1980 1990 2000 2010 Year 48

Statistical Models Let t 1i : ith year of series y i : temperature anomaly in year t i t 2i = (t 1i 1998) + y i = β 0 + β 1 t 1i + β 2 t 2i + u i Simple linear regression (OLS): u i N[0, σ 2 ] (IID) Time series regression (GLS): u i φ 1 u i 1... φ p u i p = ɛ i + θ 1 ɛ i 1 +... + θ q ɛ i q, ɛ i N[0, σ 2 ] (IID) Fit using arima function in R 49

HadCRUT4 gl Temperature Anomalies 1960 2014 OLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.2 0.0 0.2 0.4 Change of slope 0.85 deg/cen (SE 0.50 deg/cen) 1960 1970 1980 1990 2000 2010 Year 50

HadCRUT4 gl Temperature Anomalies 1960 2014 GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.2 0.0 0.2 0.4 Change of slope 1.16 deg/cen (SE 0.4 deg/cen) 1960 1970 1980 1990 2000 2010 Year 51

NOAA Temperature Anomalies 1960 2014 GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.0 0.2 0.4 0.6 Change of slope 0.21 deg/cen (SE 0.62 deg/cen) 1960 1970 1980 1990 2000 2010 Year 52

GISS (NASA) Temperature Anomalies 1960 2014 GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 Change of slope 0.29 deg/cen (SE 0.54 deg/cen) 1960 1970 1980 1990 2000 2010 Year 53

Berkeley Earth Temperature Anomalies 1960 2014 GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 Change of slope 0.74 deg/cen (SE 0.6 deg/cen) 1960 1970 1980 1990 2000 2010 Year 54

Cowtan Way Temperature Anomalies 1960 2014 GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 Change of slope 0.93 deg/cen (SE 1.24 deg/cen) 1960 1970 1980 1990 2000 2010 Year 55

Adjustment for the El Niño Effect El Niño is a weather effect caused by circulation changes in the Pacific Ocean 1998 was one of the strongest El Niño years in history A common measure of El Niño is the Southern Oscillation Index (SOI), computed monthly Here use SOI with a seven-month lag as an additional covariate in the analysis 56

HadCRUT4 gl With SOI Signal Removed GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.2 0.0 0.2 0.4 Change of slope 0.81 deg/cen (SE 0.75 deg/cen) 1960 1970 1980 1990 2000 2010 Year 57

NOAA With SOI Signal Removed GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.0 0.2 0.4 0.6 Change of slope 0.18 deg/cen (SE 0.61 deg/cen) 1960 1970 1980 1990 2000 2010 Year 58

GISS (NASA) With SOI Signal Removed GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 Change of slope 0.24 deg/cen (SE 0.54 deg/cen) 1960 1970 1980 1990 2000 2010 Year 59

Berkeley Earth With SOI Signal Removed GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 Change of slope 0.69 deg/cen (SE 0.58 deg/cen) 1960 1970 1980 1990 2000 2010 Year 60

Cowtan Way With SOI Signal Removed GLS Fit, Changepoint at 1998 Global Temperature Anomaly 0.2 0.0 0.2 0.4 0.6 Change of slope 0.99 deg/cen (SE 0.79 deg/cen) 1960 1970 1980 1990 2000 2010 Year 61

Selecting The Changepoint If we were to select the changepoint through some form of automated statistical changepoint analysis, where would we put it? 62

HadCRUT4 gl Change Point Posterior Probability Posterior Probability of Changepoint 0.00 0.04 0.08 0.12 1960 1970 1980 1990 2000 2010 Year 63

Conclusion from Temperature Trend Analysis No evidence of decrease post-1998 if anything, the trend increases after this time After adjusting for El Niño, even stronger evidence for a continuously increasing trend If we were to select the changepoint instead of fixing it at 1998, we would choose some year in the 1970s Thus: No statistical evidence to support the hiatus hypothesis 64

TIME SERIES ANALYSIS FOR CLIMATE DATA I Overview II The post-1998 hiatus in temperature trends III NOAA s record streak IV Trends or nonstationarity? 65

66

Continental US monthly temperatures, Jan 1895 Oct 2012. For each month between June 2011 and Sep 2012, the monthly temperature was in the top tercile of all observations for that month up to that point in the time series. Attention was first drawn to this in June 2012, at which point the series of top tercile events was 13 months long, leading to a naïve calculation that the probability of that event was (1/3) 13 = 6.3 10 7. Eventually, the streak extended to 16 months, but ended at that point, as the temperature for Oct 2012 was not in the top tercile. In this study, we estimate the probability of either a 13-month or a 16-month streak of top-tercile events, under various assumptions about the monthly temperature time series. 67

68

69

70

71

Method Two issues with NOAA analysis: Neglects autocorrelation Ignores selection effect Solutions: Fit time series model ARMA or long-range dependence Use simulation to determine the probability distribution of the longest streak in 117 years Some of the issues: Selection of ARMA model AR(1) performs poorly Variances differ by month must take that into account Choices of estimation methods, e.g. MLE or Bayesian Bayesian methods allow one to take account of parameter estimation uncertainty 72

73

74

Conclusions It s important to take account of monthly varying standard deviations as well as means. Estimation under a high-order ARMA model or fractional differencing lead to very similar results, but don t use AR(1). In a model with no trend, the probability that there is a sequence of length 16 consecutive top-tercile observations somewhere after year 30 in the 117-year time series is of the order of 0.01 0.03, depending on the exact model being fitted. With a linear trend, these probability rise to something over.05. Include a nonlinear trend, and the probabilities are even higher in other words, not surprising at all. Overall, the results may be taken as supporting the overall anthropogenic influence on temperature, but not to a stronger extent than other methods of analysis. 75

TIME SERIES ANALYSIS FOR CLIMATE DATA I Overview II The post-1998 hiatus in temperature trends III NOAA s record streak IV Trends or nonstationarity? 76

A Parliamentary Question is a device where any member of the U.K. Parliament can ask a question of the Government on any topic, and is entitled to expect a full answer. 77

www.parliament.uk, April 22, 2013 78

79

Essence of the Met Office Response Acknowledged that under certain circumstances an ARIMA(3,1,0) without drift can fit the data better than an AR(1) model with drift, as measured by likelihood The result depends on the start and finish date of the series Provides various reasons why this should not be interpreted as an argument against climate change Still, it didn t seem to me (RLS) to settle the issue beyond doubt 80

There is a tradition of this kind of research going back some time 81

82

Summary So Far Integrated or unit root models (e.g. ARIMA(p, d, q) with d = 1) have been proposed for climate models and there is some statistical support for them If these models are accepted, the evidence for a linear trend is not clear-cut Note that we are not talking about fractionally integrated models (0 < d < 2 1 ) for which there is by now a substantial tradition. These models have slowly decaying autocorrelations but are still stationary Integrated models are not physically realistic but this has not stopped people advocating them I see the need for a more definitive statistical rebuttal 83

HadCRUT4 Global Series, 1900 2012 Model I : y t y t 1 = ARMA(p, q) (mean 0) Model II : y t = Linear Trend + ARMA(p, q) Model III : y t y t 1 = Nonlinear Trend + ARMA(p, q) Model IV : y t = Nonlinear Trend + ARMA(p, q) Use AICC as measure of fit 84

Integrated Time Series, No Trend p q 0 1 2 3 4 5 0 165.4 178.9 182.8 180.7 187.7 185.8 1 169.2 181.3 180.7 184.1 186.3 184.4 2 176.0 182.8 185.7 182.7 184.7 184.4 3 185.5 184.2 185.2 183.0 184.4 184.0 4 183.5 181.5 183.0 180.7 181.5 NA 5 185.2 183.1 181.0 185.8 183.6 182.5 85

Stationary Time Series, Linear Trend p q 0 1 2 3 4 5 0 136.1 168.8 178.2 176.2 180.9 181.6 1 183.1 183.1 186.8 184.5 190.8 188.5 2 181.3 181.7 184.5 187.4 189.2 187.3 3 182.6 186.6 188.9 187.1 189.3 187.3 4 189.7 188.7 188.4 185.4 185.1 NA 5 187.9 187.6 186.0 183.0 182.6 183.8 86

Integrated Time Series, Nonlinear Trend p q 0 1 2 3 4 5 0 156.8 195.1 201.7 199.4 207.7 208.9 1 161.4 199.5 199.4 202.3 210.3 209.0 2 169.9 202.3 210.0 201.4 209.7 208.7 3 183.2 201.0 203.5 201.2 207.3 204.8 4 180.9 199.3 201.2 198.7 205.3 NA 5 186.8 201.7 199.4 207.7 204.8 204.8 87

Stationary Time Series, Nonlinear Trend p q 0 1 2 3 4 5 0 199.1 204.6 202.4 217.8 216.9 215.9 1 202.6 202.3 215.2 217.7 216.1 214.7 2 205.2 217.3 205.0 216.6 214.1 213.3 3 203.8 205.9 203.6 214.3 211.7 213.5 4 202.2 203.5 213.7 212.0 227.1 NA 5 205.7 203.2 216.2 233.3 212.6 226.5 88

Integrated Mean 0 Stationary Linear Trend Series 0.2 0.1 0.0 0.1 0.2 0.3 Series 0.4 0.2 0.0 0.2 0.4 1900 1920 1940 1960 1980 2000 Year 1900 1920 1940 1960 1980 2000 Year Integrated Nonlinear Trend Stationary Nonlinear Trend Series 0.2 0.1 0.0 0.1 0.2 0.3 Series 0.4 0.2 0.0 0.2 0.4 1900 1920 1940 1960 1980 2000 Year 1900 1920 1940 1960 1980 2000 Year Four Time Series Models with Fitted Trends 89

Integrated Mean 0 Stationary Linear Trend Residual 0.2 0.1 0.0 0.1 0.2 Residual 0.2 0.1 0.0 0.1 0.2 0 20 40 60 80 100 Year 0 20 40 60 80 100 Year Integrated Nonlinear Trend Stationary Nonlinear Trend Residual 0.1 0.0 0.1 Residual 0.15 0.05 0.05 0.15 0 20 40 60 80 100 Year 0 20 40 60 80 100 Year Residuals From Four Time Series Models 90

Integrated Mean 0 Stationary Linear Trend Residual 0.2 0.1 0.0 0.1 0.2 Residual 0.2 0.1 0.0 0.1 0.2 0 20 40 60 80 100 Year 0 20 40 60 80 100 Year Integrated Nonlinear Trend Stationary Nonlinear Trend Residual 0.1 0.0 0.1 Residual 0.15 0.05 0.05 0.15 0 20 40 60 80 100 Year 0 20 40 60 80 100 Year Residuals From Four Time Series Models 91

Conclusions If we restrict ourselves to linear trends, there is not a clearcut preference between integrated time series models without a trend and stationary models with a trend However, if we extend the analysis to include nonlinear trends, there is a very clear preference that the residuals are stationary, not integrated Possible extensions: Add fractionally integrated models to the comparison Bring in additional covariates, e.g. circulation indices and external forcing factors Consider using a nonlinear trend derived from a climate model. That would make clear the connection with detection and attribution methods which are the preferred tool for attributing climate change used by climatologists. 92