Doing science with multi-model ensembles

Gerald A. Meehl
National Center for Atmospheric Research
Biological and Energy Research Regional and Global Climate Modeling Program

Why use a multi-model ensemble average?
A multi-model average often outperforms any individual model when compared to observations.
--Demonstrated for mean climate (Gleckler et al., 2008; Reichler and Kim, 2008)
--Detection and attribution (Zhang et al., 2007)
--Statistics of variability (Pierce et al., 2009)
--Some systematic biases (i.e., evident in most or all models) can be readily identified in multi-model averages (Knutti et al., 2010)
[Figure: multivariate metric for mean climate simulation (Reichler and Kim, 2008), comparing the multi-model average with individual models; axis direction labeled "Better simulation"]

Best practice for analysis of multi-model ensembles (Knutti et al., 2010: Good Practice Guidance Paper on Assessing and Combining Multi Model Climate Projections, IPCC)
How do you produce a multi-model average when each model has a different number of ensemble members?
1. The default is to take all models in the multi-model ensemble and either use one realization from each model, or average the ensemble members of each individual model first, and then average those per-model values together.
"But I just want to use the best models in my multi-model average; how do I define what the best models are?" Model ranking depends on the metrics applied: different metrics give different rankings.
2. A subset of models (from which a multi-model average can be computed) can be taken from the total collection of models if a physical reason can be supplied to justify the choice of which models are in the subset.
3. New methods are being developed to weight models, given that many models are not independent (see Ben Sanderson and Reto Knutti talks Thursday).
4. Emergent constraints can provide guidance.
5. Comparison to observations should also take into account uncertainty in observations (see Ben Santer talk Thursday).
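The default averaging in item 1 can be sketched in a few lines. This is a minimal illustration, not code from the guidance paper; the function name `multi_model_mean`, the model names, and the numbers are all hypothetical. It contrasts the "one model, one vote" average with naively pooling every member, which lets models with large ensembles dominate.

```python
import numpy as np

def multi_model_mean(ensembles, use_first_member_only=False):
    """Average across models so that each model counts once.

    ensembles: dict mapping model name -> array of shape (n_members, ...).
    Models may have different numbers of members.
    """
    per_model = []
    for members in ensembles.values():
        members = np.asarray(members, dtype=float)
        if use_first_member_only:
            per_model.append(members[0])            # one realization per model
        else:
            per_model.append(members.mean(axis=0))  # that model's ensemble mean
    return np.mean(per_model, axis=0)

# Hypothetical data: model_A has 3 members, model_B has 1.
ensembles = {
    "model_A": np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]),
    "model_B": np.array([[10.0, 20.0]]),
}

mmm = multi_model_mean(ensembles)                              # each model counts once
first_only = multi_model_mean(ensembles, use_first_member_only=True)
pooled = np.concatenate(list(ensembles.values())).mean(axis=0)  # model_A dominates
```

Here `mmm` and `pooled` differ because pooling gives model_A three votes to model_B's one.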

Model simulations in CMIP5 have improved compared to previous CMIP phases
Model ranking and fidelity across CMIP generations using only temperature and precipitation (Knutti et al., 2013, GRL)
Most models are strongly tied to their predecessors, and some also exchange ideas and code with other models, thus supporting an earlier hypothesis that the models in the new ensemble are neither independent of each other nor independent of the earlier generation.
[Figure: model ranking with multi-model mean and multi-model median marked; axis direction labeled "Better simulation"]
Source: Geophysical Research Letters, Volume 40, Issue 6, pages 1194-1199, 26 MAR 2013, DOI: 10.1002/grl.50256, http://onlinelibrary.wiley.com/doi/10.1002/grl.50256/full#grl50256-fig-0003

Model weighting based on model independence and skill (Sanderson and Wehner, 2016; Sanderson, Knutti, Caldwell, 2016)
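A rough sketch of how a combined skill-and-independence weight can be formed is given below. It is written in the spirit of the cited weighting papers but is not their exact implementation: the Gaussian form, the function name `skill_independence_weights`, and the radius parameters `sigma_d` and `sigma_s` are assumptions for illustration, and the distances are made-up numbers.

```python
import numpy as np

def skill_independence_weights(dist_to_obs, dist_between, sigma_d, sigma_s):
    """Illustrative model weights (assumed functional form).

    dist_to_obs:  (n,) distance of each model from observations (smaller = more skill)
    dist_between: (n, n) symmetric inter-model distances (smaller = more similar)
    sigma_d, sigma_s: tunable radii for skill and similarity (assumptions)
    """
    d = np.asarray(dist_to_obs, dtype=float)
    s = np.asarray(dist_between, dtype=float)
    skill = np.exp(-(d / sigma_d) ** 2)
    similarity = np.exp(-(s / sigma_s) ** 2)
    np.fill_diagonal(similarity, 0.0)           # a model is not its own neighbour
    independence = 1.0 / (1.0 + similarity.sum(axis=1))
    w = skill * independence
    return w / w.sum()

# Hypothetical case: models 0 and 1 are near-duplicates, model 2 is distinct,
# and all three are equally close to observations.
dist_to_obs = [1.0, 1.0, 1.0]
dist_between = [[0.0, 0.0, 10.0],
                [0.0, 0.0, 10.0],
                [10.0, 10.0, 0.0]]
weights = skill_independence_weights(dist_to_obs, dist_between, sigma_d=1.0, sigma_s=1.0)
```

With equal skill, the two near-duplicate models end up sharing roughly one vote between them while the distinct model keeps a full vote.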

Emergent constraints can be used in a multi-model context to quantify relevant processes for climate system response: find a metric in the climate system that incorporates a feedback and a temperature response, then identify the best models from this metric.
Example: snow-albedo feedback in the seasonal cycle compared to snow-albedo feedback under global warming (Hall and Qu, 2006, GRL)
--x axis: warming of springtime temperature divided by reduction of land surface albedo
--y axis: warming from increasing CO2 divided by reduction of land surface albedo
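The emergent-constraint idea can be illustrated with a simple across-model regression: fit the relationship between an observable present-day metric (x) and the future response (y), then read off the response implied by the observed x. The sketch below uses invented numbers and a hypothetical function name `emergent_constraint`; it is not the Hall and Qu analysis itself.

```python
import numpy as np

def emergent_constraint(x_models, y_models, x_obs):
    """Fit y = slope * x + intercept across models and evaluate at observed x."""
    x = np.asarray(x_models, dtype=float)
    y = np.asarray(y_models, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)      # least-squares straight line
    r = np.corrcoef(x, y)[0, 1]                 # strength of the across-model link
    return slope * x_obs + intercept, slope, intercept, r

# Hypothetical values (arbitrary units): one point per model.
x_models = [0.4, 0.6, 0.8, 1.0, 1.2]            # seasonal-cycle feedback metric
y_models = [0.9 * x + 0.05 for x in x_models]   # climate-change feedback metric
x_obs = 0.9                                     # assumed observed seasonal-cycle value

y_constrained, slope, intercept, r = emergent_constraint(x_models, y_models, x_obs)
```

In practice the across-model scatter and the observational uncertainty in `x_obs` both widen the constrained estimate; this sketch returns only the central value.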

A traditional multi-model result from the IPCC AR5 (CMIP5 models)
IPCC AR5 Fig. SPM.7 (multi-model average using a single realization from each model, with 5-95% (+/- 1.64 standard deviation) uncertainty ranges)
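The 5-95% range quoted here is the multi-model mean plus or minus 1.64 standard deviations across models, which corresponds to the 5th and 95th percentiles if the model spread is Gaussian. A minimal sketch, with hypothetical input values and a hypothetical function name `mean_and_5_95_range` (the sample standard deviation with `ddof=1` is also an assumption):

```python
import numpy as np

def mean_and_5_95_range(values):
    """values: array (n_models, n_times), one realization per model."""
    values = np.asarray(values, dtype=float)
    mean = values.mean(axis=0)
    std = values.std(axis=0, ddof=1)            # spread across models
    return mean, mean - 1.64 * std, mean + 1.64 * std

# Hypothetical temperature changes from three models at one time.
values = [[1.0], [2.0], [3.0]]
mean, low, high = mean_and_5_95_range(values)
```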

A subset of 5 models selected based on agreement with recent observed average sea ice thickness and trends of sea ice extent
IPCC AR5 Fig. SPM.7 (five-model average using a single realization from each model, and model minimum-maximum uncertainty ranges)

What about the early-2000s hiatus? (IPCC AR5 Chapter 9, Fig. 9.8)
Models reproduce observed temperature trends over many decades, including the more rapid warming since the mid-20th century and the cooling immediately following large volcanic eruptions (very high confidence).
Early 21st century global mean surface temperature slowdown:
--1998-2012: 0.04 ºC/decade
--1951-2012: 0.11 ºC/decade
WGI AR5 Final Draft 07 June

Using CMIP5 multi-model ensemble averages to attribute warming to anthropogenic forcings
Human influence on the climate system is clear (IPCC AR5 Figure SPM.6)
It is extremely likely that human influence has been the dominant cause of the observed warming since the mid-20th century.

The time evolution of the observed climate system is a combination of externally forced response and internally generated climate variability
Dominant pattern of internally generated variability from control runs (left) and observed IPO pattern (right) (Meehl, Hu, Santer and Xie, 2016, Nature Climate Change)
A five-model ensemble with single forcings shows patterns and time evolution of response
[Figure panels: sulfate aerosols; GHGs]

Recent slowdown in global surface temperature increase
The early-2000s slowdown (2001-2014, negative phase of the Interdecadal Pacific Oscillation, IPO) is characterized by a trend that is significantly less than that of the previous positive IPO period from 1972-2001 (Fyfe et al., 2016, Nature Clim. Chg)

A multi-model ensemble average removes internally generated variability ("noise") and leaves the externally forced response
But how does internally generated decadal climate variability, in the single realization we have for the observations, combine with the externally forced response to produce what we have observed?
We are interested in understanding the sources and processes that produce the climate noise, and the interplay with the externally forced response (from GHGs, volcanoes, etc.); this is relevant for decadal climate prediction.
Can we use the CMIP5 multi-model data set to quantify the contribution of the Interdecadal Pacific Oscillation to GMST epoch trends?
[Figure: epochs labeled positive IPO, negative IPO, positive IPO, negative IPO] (Meehl, Hu, Santer and Xie, 2016, Nature Climate Change)
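The noise-removal property of an ensemble average can be seen with synthetic data. The sketch below assumes a linear forced trend plus independent white noise in each member; real internal variability is autocorrelated and real models are not fully independent, so the reduction in practice is smaller than the idealized 1/sqrt(N). All numbers are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n_members, n_years = 40, 100
years = np.arange(n_years)

forced = 0.02 * years                                   # hypothetical forced response
noise = rng.normal(0.0, 0.15, size=(n_members, n_years))  # stand-in for internal variability
members = forced + noise                                # each row is one realization

ens_mean = members.mean(axis=0)

# Departure from the forced response: one member versus the ensemble mean.
single_resid = (members[0] - forced).std()
mean_resid = (ens_mean - forced).std()
```

A single member, like the single observed realization, retains the full noise amplitude; the ensemble mean retains only a small fraction of it.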

How much could the IPO contribute to GMST trends?
--Compute distribution of decadal GMST trends for IPO positive and negative phases in 1100 year control run
--Adjust externally forced trends from CMIP5 multi-model mean with IPO-related trends
--Compare IPO-adjusted GMST trends with observed GMST trends
[Figure: observed trends, externally forced trend (CMIP5 multi-model ensemble), and IPO-adjusted trend for epochs labeled positive IPO, negative IPO, positive IPO, negative IPO. Panel annotations as extracted: 71%; 25%; "initial pulse"; 75%; 27%; 25%. The percentages give the IPO contribution to the difference between median values of forced trend and observed trend.]
(Meehl, Hu, Santer and Xie, 2016, Nature Climate Change)
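The three steps can be sketched on synthetic data. This is an illustration under stated assumptions, not the procedure from the cited paper: the control-run series are invented (a sinusoidal "IPO" index driving GMST tendency), phases are classified by the sign of the window-mean index, and the forced trend value is hypothetical. Trends are per year; multiply by 10 for per decade.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def window_trends(series, window):
    """Least-squares slope (per time step) for every overlapping window."""
    t = np.arange(window) - (window - 1) / 2.0          # centred time axis, sums to 0
    windows = sliding_window_view(np.asarray(series, dtype=float), window)
    return windows @ t / (t @ t)

def ipo_phase_trends(gmst, ipo, window=10):
    """Split decadal GMST trends by the sign of the window-mean IPO index."""
    trends = window_trends(gmst, window)
    phase = sliding_window_view(np.asarray(ipo, dtype=float), window).mean(axis=1)
    return trends[phase > 0], trends[phase < 0]

# Synthetic 1100-year "control run" (assumption: GMST tendency follows the IPO index).
rng = np.random.default_rng(1)
n_years = 1100
t = np.arange(n_years)
ipo = np.sin(2 * np.pi * t / 40.0)
gmst = np.cumsum(0.01 * ipo) + rng.normal(0.0, 0.01, n_years)

pos, neg = ipo_phase_trends(gmst, ipo)

forced_trend = 0.02                                     # hypothetical multi-model mean trend
adjusted_pos = forced_trend + np.median(pos)            # IPO-adjusted, positive phase
adjusted_neg = forced_trend + np.median(neg)            # IPO-adjusted, negative phase
```

The adjusted values would then be compared with the observed trend for each epoch.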

"Temperatures have flatlined over the past 15 years and to my knowledge, not a single climate model ever predicted that a pause in global warming would ever occur."
--Senator James Inhofe (R-Okla.) in U.S. Senate hearing on the Obama Climate Action Plan on January 16, 2014 (quoted in Eos, January 28, 2014)

Some CMIP5 uninitialized models actually simulated the slowdown
These simulations tend to be characterized by a negative phase of the IPO: internally generated variability in those model simulations happened to sync with observed internally generated variability.
Total: 262 possible simulations
--2000-2012 slowdown: 21
--2000-2014 slowdown: 9
--2000-2015 slowdown: 6
--2000-2016 slowdown: 6
--2000-2017 slowdown: 1
--2000-2018: 1
Slowdown as observed from 2000-2013: 10 members out of 262 possible realizations (Meehl et al., 2014, Nature Climate Change)

Larger increasing trends of Antarctic sea ice since 2000 are associated with the negative IPO phase, a deeper Amundsen Sea Low, and stronger northward surface winds in the Pacific sector
The multi-model ensemble mean shows Antarctic sea ice decreases.
But ten of the model ensemble members simulate the 2000-2014 global surface warming slowdown and also simulate a negative IPO phase with increasing Antarctic sea ice.
Antarctic sea ice anomalies are traced to SST and precipitation anomalies in the eastern equatorial Pacific with negative IPO phase, in a climate model experiment with specified convective heating anomalies.
(Meehl et al., July, 2016, Nature Geoscience)

Depicting uncertainty in a CMIP5 multi-model ensemble (IPCC AR5, Fig. SPM.8)
Stippling indicates regions where the multi-model mean is more than two standard deviations of natural internal variability in 20-yr means and where at least 90% of models agree on the sign of change; hatching indicates regions where the multi-model mean is less than one standard deviation of natural internal variability in 20-yr means.
[Figure annotation: number of models]
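The two map overlays can be computed as boolean masks. The sketch below assumes the AR5 caption convention (stippling: mean change larger than two standard deviations of internal variability and at least 90% sign agreement; hatching: mean change smaller than one standard deviation). The function name `significance_masks` and the toy grid are hypothetical.

```python
import numpy as np

def significance_masks(changes, sigma_internal, agree_frac=0.9):
    """changes: (n_models, ny, nx) projected change; sigma_internal: (ny, nx)."""
    changes = np.asarray(changes, dtype=float)
    sigma = np.asarray(sigma_internal, dtype=float)
    mean = changes.mean(axis=0)
    # Fraction of models whose change has the same sign as the multi-model mean.
    agreement = (np.sign(changes) == np.sign(mean)).mean(axis=0)
    stipple = (np.abs(mean) > 2.0 * sigma) & (agreement >= agree_frac)
    hatch = np.abs(mean) < sigma
    return stipple, hatch

# Toy example: 10 models on a 1 x 3 grid, internal variability of 1 everywhere.
changes = np.empty((10, 1, 3))
changes[:, 0, 0] = 3.0          # large change, all models agree -> stippled
changes[:, 0, 1] = 0.5          # small change -> hatched
changes[:8, 0, 2] = 5.0         # large mean change, but only 80% agree on sign
changes[8:, 0, 2] = -1.0
sigma_internal = np.ones((1, 3))

stipple, hatch = significance_masks(changes, sigma_internal)
```

The third grid cell is neither stippled nor hatched: the mean change is large, but sign agreement falls short of 90%.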

The new field of decadal climate prediction seeks to use climate models initialized with observations to predict the time evolution of the statistics of regional climate over the near term (i.e., the next 10 years) by predicting the interplay between internal variability and the response to increasing GHGs.
Initialized hindcasts/predictions were specified in CMIP5 for the first time (ten-year hindcasts initialized for every year starting in 1960).
Can decadal climate variability processes and mechanisms, if properly initialized, provide increased prediction skill of the time evolution of regional climate in the near term?

The IPCC AR5 assessed temperature range for 2016-2035 is less than that from uninitialized projections, in part due to results from initialized decadal predictions in CMIP5
[Figures from IPCC AR5, ch 11: uninitialized, Figure 11.9a; initialized, Figure 11.9b; assessed temperature change, Figure 11.25b]

Initialized CMIP5 simulations better simulate the mid-1970s shift to the positive IPO phase and the early-2000s hiatus (negative IPO) compared to free-running simulations (5-year average, prediction for years 3-7)
--16 models; stippling: multi-model ensemble mean +/- 2 standard deviations warmer/colder than observations, as in Smith et al., 2012
--Lower left numbers: pattern correlation (area-mean removed) / RMSE
--Upper right: global T
--Monte Carlo test: 1000-year CCSM4 control run, calculated pattern correlations of 100,000 random patterns; the 95th percentile is a pattern correlation of 0.59
(Meehl and Teng, GRL, 2013)
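The two skill numbers on each map panel can be computed as below. This is a sketch under assumptions: cosine-latitude area weighting, the area mean removed only for the correlation, and the RMSE taken on the full fields. The function name `pattern_stats` and the test fields are hypothetical.

```python
import numpy as np

def pattern_stats(pred, obs, lat):
    """Area-weighted centred pattern correlation and RMSE.

    pred, obs: (nlat, nlon) fields; lat: (nlat,) latitudes in degrees.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    w = np.cos(np.deg2rad(lat))[:, None] * np.ones_like(pred)
    w = w / w.sum()                                     # weights sum to 1
    pred_anom = pred - (w * pred).sum()                 # remove area mean
    obs_anom = obs - (w * obs).sum()
    cov = (w * pred_anom * obs_anom).sum()
    corr = cov / np.sqrt((w * pred_anom ** 2).sum() * (w * obs_anom ** 2).sum())
    rmse = np.sqrt((w * (pred - obs) ** 2).sum())
    return corr, rmse

lat = np.array([-60.0, -30.0, 0.0, 30.0, 60.0])
obs = np.arange(20, dtype=float).reshape(5, 4)          # hypothetical observed pattern
pred = obs + 0.5                                        # same pattern, uniform warm bias

corr, rmse = pattern_stats(pred, obs, lat)
corr_flipped, _ = pattern_stats(-obs, obs, lat)
```

Removing the area mean makes the correlation insensitive to a uniform bias, which is why the global-mean temperature is reported separately; the RMSE still registers the bias.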

CMIP5 multi-model data are very useful for climate science research
But single models available in the CMIP5 data set are still useful to study processes

Using a single model (CCSM4) to address the hypothesis that off-equatorial ocean heat content in the tropical western Pacific can provide the conditions for ENSO events to trigger a decadal timescale IPO transition (Meehl, Hu, Teng, 2016, Nature Communications)

Previous IPO shifts showed qualitative agreement between initialized hindcasts and observations
An El Niño could trigger an IPO transition to positive (mid-1970s), or a La Niña a transition to IPO negative (late 1990s), after buildup of off-equatorial heat content anomalies in the western Pacific.
Predictions with CCSM4 initialized in 2013 show qualitative agreement, with above normal Niño3.4 SSTs in 2014-2015 (Meehl, Hu, Teng, 2016, Nature Communications).
With the build-up of off-equatorial western Pacific heat content, this could trigger a transition to the positive phase of the IPO and larger rates of global warming.
[Figure label: Niño3.4]

CCSM4 prediction initialized in 2013 indicates a positive phase of the IPO for the 3-7 year average, 2015-2019
This is quite different from persistence (2008-2012 persisted to 2015-2019), and is different from the uninitialized projection for 2015-2019 (Meehl et al., 2016, Nature Communications).

Predicted rate of global warming from 2013 initial year is greater than during the early-2000s slowdown and greater than uninitialized:
--Observed 2001-2014: +0.08±0.05 °C/decade
--Predicted 2013-2022: +0.22±0.13 °C/decade
--Uninitialized 2013-2022: +0.14±0.12 °C/decade
(Meehl et al., 2016, Nature Communications)

Summary
1. A multi-model ensemble average usually outperforms any single model, and averages out internal variability to focus on the forced response.
2. To produce a multi-model average when each model has a different number of ensemble members: take all models in the multi-model ensemble and either use one realization from each model, or average the ensemble members of each individual model first, and then average those together.
3. Model ranking depends on the metrics applied: different metrics give different rankings.
4. A subset of models (from which a multi-model average can be computed) can be taken from the total collection of models if a physical reason can be supplied to justify the choice of which models are in the subset.
5. New methods are being developed to weight models, given that many models are not independent (see Ben Sanderson and Reto Knutti talks Thursday).
6. Emergent constraints can provide guidance.
7. Comparison to observations should also take into account uncertainty in observations (see Ben Santer talk Thursday).
8. Single models available in the CMIP5 data set are still useful to study processes.