Imperfect Data in an Uncertain World

James B. Elsner
Department of Geography, Florida State University, Tallahassee, Florida
Corresponding author address: Dept. of Geography, Florida State University, Tallahassee, FL 32306. Tel: 850-644-8374, Fax: 850-644-5913, E-mail: jelsner@garnet.fsu.edu

Kam-biu Liu
Department of Geography and Anthropology, Louisiana State University, Baton Rouge, LA

Thomas H. Jagger
Department of Geography, Florida State University, Tallahassee, FL

June 4, 2005

© J.B. Elsner, K.-b. Liu, & T.H. Jagger

Abstract

Bayesian analysis and modeling, in which uncertainties are quantified in terms of probability, offer an alternative route to understanding in meteorological applications. The underlying principle, practiced in fields like archaeology and geology, is the accumulation of evidence. The approach provides a mathematical rule to update existing beliefs in light of new evidence. It requires neither that the data be uniform in precision nor that the evidence be complete. Expressing research results in Bayesian terms makes them simpler to understand and makes inconclusive results less prone to misinterpretation. This note shows that a Bayesian analysis produces an arguably more precise estimate of the long-term U.S. hurricane rate.

Introduction

Data are now widely collected and synthesized using global positioning and geographic information systems. Advances in satellite imagery and remote sensing technologies allow unprecedented access to temporal and spatial data at various resolutions. This is in contrast to data collected with older technologies, which are, in general, less precise and more uncertain. Here an attempt is made to increase awareness of a perspective on using disparate information, especially information that is more uncertain, that broadens the way modeling and analysis are done. The goal is to motivate the consideration of the Bayesian approach in meteorology and allied fields. The view is different from the classical perspective, but it is useful when one's knowledge of the system is incomplete. The theoretical basis is borrowed from fields like archaeology and palaeoclimatology, while the formalism derives from Bayesian statistics.

Background

In meteorology and related disciplines, statistical inference is traditionally taught from the frequentist perspective. To illustrate, suppose we are interested in the risk of hurricanes to New Orleans. More precisely, we want to know the number of hurricanes likely to strike New Orleans over a given number of future years. This is a random quantity that we assume has a Poisson distribution with a fixed but unknown rate parameter, called lambda.

The frequentist approach uses statistics to estimate unknown parameters. The parameter is assumed to be fixed and unknowable, except through the observed data. For the Poisson distribution we estimate lambda using the sample mean. In our case, we simply divide the number of storms observed by the number of observation years. The sample mean is a good statistic for the Poisson distribution because it is the minimum variance unbiased estimate of lambda. Loosely, this means that among unbiased estimates it gives the narrowest confidence interval on lambda at a given confidence level.

Suppose that in a reliable sample of 100 years, 8 hurricanes affect New Orleans. The sample annual mean number of hurricanes is 0.08. Inferential statistics is about making inferences on the parameters of a distribution. Under the assumption that the number of hurricanes follows a Poisson distribution, an inference is made that the rate parameter is 0.08 hurricanes/yr. Confidence intervals about the parameter are based on the idea that the sample of data at hand is just one possible sample from an infinite pool of data. Considering the process of data collection, a 95% confidence interval comes from a procedure that, over a large number of repeated samples, produces intervals containing the true value of the parameter 95% of the time.

Bayesian statistical inference is an alternative approach in which all forms of uncertainty are expressed and quantified in terms of probability.
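Before turning to the Bayesian alternative in detail, the frequentist New Orleans calculation above can be sketched in a few lines of Python. This is a minimal illustration, not a method taken from the note: the function name `poisson_rate_estimate` is hypothetical, and the interval uses a normal (Wald) approximation, which is an assumption made here for simplicity and is only rough when the count is as small as 8.

```python
from math import sqrt
from statistics import NormalDist


def poisson_rate_estimate(total_count, n_years, level=0.95):
    """Sample-mean estimate of a Poisson rate with a normal-approximation interval.

    Hypothetical helper for illustration; the Wald interval is an assumption,
    not the interval construction used by the authors.
    """
    rate = total_count / n_years                 # storms divided by observation years
    z = NormalDist().inv_cdf(0.5 + level / 2.0)  # about 1.96 for a 95% interval
    half_width = z * sqrt(rate / n_years)        # Var(sample mean) = lambda / n for Poisson counts
    return rate, max(0.0, rate - half_width), rate + half_width


# The example from the text: 8 hurricanes in a reliable sample of 100 years.
rate, lower, upper = poisson_rate_estimate(total_count=8, n_years=100)
print(f"rate = {rate:.2f} per year, approx. 95% CI = ({lower:.3f}, {upper:.3f})")
```

The point estimate is the 0.08 hurricanes/yr quoted in the text; the width of the interval shows how little a single 100-year sample constrains a small rate.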
The Bayesian approach to inference was introduced into the climate and related communities by Epstein (1985). Its usefulness as a fundamental strategy for solving problems in weather and climate research is advocated in Berliner et al. (1998). Here again we assume the number of hurricanes to affect New Orleans over a given length of time follows a Poisson distribution having an unknown rate parameter lambda. A probability distribution is used to represent our belief about this parameter. The probability distribution, called the prior, reflects our knowledge about the rate of New Orleans hurricanes before the sample of reliable years is examined. Then, after examining the sample of 100 years of reliable data, our opinions about the rate likely will change. In a Bayesian analysis, a set of observations is seen as something that changes our opinion, rather than as a way to determine ultimate truth. Bayes' rule is the recipe for computing the new probability distribution for the rate, called the posterior, based on knowledge of the prior probability distribution and the reliable data. Inferences about the hurricane rate are made by computing summaries of the posterior distribution.

Example

Consider the annual U.S. hurricane rate. A U.S. hurricane is a tropical cyclone that makes landfall in the United States at hurricane intensity (33 m/s). The set of annual counts of U.S. hurricanes before the Galveston hurricane tragedy of 1900 is imperfect. Some storms are likely to have gone undetected. For others, the intensity at landfall is uncertain. A classical analysis of the data would likely disregard the available 19th century counts. In contrast, a Bayesian approach keeps the earlier records but uses probability to express the uncertainty associated with them. The 19th century counts are treated as a prior, and the uncertainty about the rate during this time is given in terms of a probability distribution. The uncertainty arises from having only a single sample and from the possibility of missing and misspecified storms. Knowledge about the hurricane rate based on 20th century records is also expressed in terms of a probability distribution (the likelihood), with the uncertainty arising from having only a single sample. Bayes' rule computes the posterior probability distribution from the prior and likelihood distributions (see Figure 1).
Here the prior distribution is centered to the right of the likelihood distribution, indicating that the second half of the 19th century was probably, on average, more active than the 20th century. This is useful information. The relatively broad prior distribution reflects both the uncertainty and the shortness of the unreliable period. The likelihood distribution is narrower and centered to the left of the prior.
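The way a broad prior and a narrower likelihood combine can be sketched with the conjugate gamma-Poisson update. This is a standard textbook shortcut assumed here for illustration; it is not the bootstrap-based prior behind Figure 1. All numbers below are hypothetical stand-ins rather than the hurricane counts used in the note, and the function names are invented for this sketch. Precision is compared through interquartile ranges estimated by Monte Carlo.

```python
import random
import statistics


def gamma_poisson_update(prior_shape, prior_rate, total_count, n_years):
    """Gamma(shape, rate) prior + Poisson counts -> Gamma posterior parameters."""
    return prior_shape + total_count, prior_rate + n_years


def gamma_iqr(shape, rate, n_draws=50_000, seed=1):
    """Monte Carlo estimate of the interquartile range of a Gamma(shape, rate)."""
    rng = random.Random(seed)
    # random.gammavariate expects a scale parameter, i.e. 1 / rate.
    draws = [rng.gammavariate(shape, 1.0 / rate) for _ in range(n_draws)]
    q1, _median, q3 = statistics.quantiles(draws, n=4)
    return q3 - q1


# Hypothetical inputs (assumptions, not the paper's data):
prior_shape, prior_rate = 40.0, 20.0   # broad prior centered on 2.0 storms per year
total_count, n_years = 170, 100        # reliable record: 170 storms in 100 years

post_shape, post_rate = gamma_poisson_update(prior_shape, prior_rate, total_count, n_years)
posterior_mean = post_shape / post_rate

# Viewed as a density in the rate, the Poisson likelihood is Gamma(count + 1, years).
iqr_likelihood = gamma_iqr(total_count + 1, n_years)
iqr_posterior = gamma_iqr(post_shape, post_rate)

# Relative gain in precision, measured by the inverse of the interquartile range.
precision_gain = iqr_likelihood / iqr_posterior - 1.0

print(f"posterior mean rate: {posterior_mean:.2f} per year")
print(f"precision gain from including the prior: {100 * precision_gain:.0f}%")
```

With these made-up inputs the posterior mean falls between the prior mean and the sample mean, and the posterior is narrower than the likelihood alone, which is the qualitative behavior described for the hurricane record.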

Combining the prior and likelihood results in a posterior distribution that represents the best information about the annual hurricane rate. The expected rate (mean rate) is 1.68 hurricanes per year with a 95% credible interval of (1.47, 1.90). The posterior distribution is narrower than the likelihood distribution, reflecting greater precision on the posterior hurricane rate. The imperfect 19th century records improve the precision of the rate estimate by 16%, as measured by the inverse of the interquartile distance when comparing the likelihood with the posterior distribution.

Summary

The underlying principle is the accumulation of evidence. Evidence may include historical or paleo-data that, by their very nature, are incomplete or fragmentary. The scientific philosophy employed in the fields of archaeology and paleontology (as well as forensics, geology, etc.) stresses the advantages of cumulative evidence. The principle of the uniformity of nature, together with at least a partial understanding of modern processes, provides a logical basis for using fragmentary data to improve our understanding of the past. In palaeoanthropology, for example, our understanding of human evolution is largely based on fragmentary evidence from fossil skulls or bones that have survived the taphonomic and preservation processes. In essence, the Bayesian approach provides a mathematical rule explaining how to update our existing beliefs in light of new evidence. This is particularly germane to global change studies (Berliner et al. 2000a, Hasselmann 1998, Leroy 1998, Varis and Kuikka 1997, Hobbs 1997).

The Bayesian framework reaches well beyond the single parameter model described above.
For instance, hierarchical (conditionally specified) Bayesian spatial models provide a method for handling the spatial dependency inherent in geographic phenomena (Royle and Berliner 1999, Kolaczyk and Huang 2001, Gotway and Young 2002), and Bayesian dynamical models are capable of explicitly accounting for uncertainty in time-series data (Berliner 1996, Lu and Berliner 1999, Berliner et al. 2000b, Elsner and Jagger 2004). Moreover, classical space-time models require stationarity, which assumes fixed model parameters and observational variances. A model with good fit in the sample data may perform poorly if the parameters are evolving. Hierarchical Bayesian space-time models provide a more flexible method for the analysis of non-stationary environmental data (Wikle et al. 1998, 2001). Thus, in an uncertain world, imperfect data need not be ignored.

References

Berliner, L. M. 1996. Hierarchical Bayesian time series models. In Maximum Entropy and Bayesian Methods, K. Hanson and R. Silver (Eds.), Dordrecht: Kluwer, 15-22.

Berliner, L. M., J. A. Royle, C. K. Wikle, and R. F. Milliff. 1998. Bayesian methods in the atmospheric sciences. In Bayesian Statistics 6, J. M. Bernardo, J. O. Berger, A. P. Dawid, and A. F. M. Smith (Eds.), Proceedings of the Sixth Valencia International Meeting, June 6-10, 1998 (Oxford Science Publications).

Berliner, L. M., R. A. Levine, and D. J. Shea. 2000a. Bayesian climate change assessment. Journal of Climate 13:3805-20.

Berliner, L. M., C. K. Wikle, and N. Cressie. 2000b. Long-lead prediction of Pacific SSTs via Bayesian dynamic modeling. Journal of Climate 13:3953-68.

Elsner, J. B., and B. H. Bossak. 2001. Bayesian analysis of U.S. hurricane climate. Journal of Climate 14:4341-50.

Elsner, J. B., and T. H. Jagger. 2004. A hierarchical Bayesian approach to seasonal hurricane modeling. Journal of Climate 17:2813-27.

Epstein, E. S. 1985. Statistical Inference and Prediction in Climatology: A Bayesian Approach, Meteorological Monographs 20 (42), American Meteorological Society, 199 pp.

Gotway, C. A., and L. J. Young. 2002. Combining incompatible spatial data. Journal of the American Statistical Association 97:632-48.

Hasselmann, K. 1998. Conventional and Bayesian approach to climate-change detection and attribution. Quarterly Journal of the Royal Meteorological Society 124:2541-65.

Hobbs, B. F. 1997. Bayesian methods for analyzing climate change and water resource uncertainties. Journal of Environmental Management 49:53-72.

Kolaczyk, E. D., and H. Y. Huang. 2001. Multiscale statistical models for hierarchical spatial aggregation. Geographical Analysis 33:95-118.

Leroy, S. S. 1998. Detecting climate signals: Some Bayesian aspects. Journal of Climate 11:640-51.

Lu, Z. Q., and L. M. Berliner. 1999. Markov switching time series models with application to a daily runoff series. Water Resources Research 35:523-34.

Royle, J. A., and L. M. Berliner. 1999. A hierarchical approach to multivariate spatial modeling and prediction. Journal of Agricultural, Biological, and Environmental Statistics 4:29-56.

Varis, O., and S. Kuikka. 1997. A Bayesian approach to expert judgment elicitation with case studies on climate change impacts on surface waters. Climatic Change 37:539-63.

Wikle, C. K., L. M. Berliner, and N. Cressie. 1998. Hierarchical Bayesian space-time models. Environmental and Ecological Statistics 5:117-54.

Wikle, C. K., R. F. Milliff, D. Nychka, and L. M. Berliner. 2001. Spatiotemporal hierarchical Bayesian modeling: Tropical ocean surface winds. Journal of the American Statistical Association 96:382-97.

Figure 1. Probability distributions for the annual rate of U.S. hurricanes. A U.S. hurricane is a tropical cyclone that makes at least one landfall in the United States at hurricane intensity (33 m/s). (a) The prior and likelihood distributions. The prior distribution is determined from a bootstrap procedure on the annual counts over the period 1851-1899 [see Elsner and Bossak (2001)]. The likelihood distribution is determined from data over the period 1900-2000. (b) The posterior distribution is determined from the prior and likelihood distributions using Bayes' rule.

[Figure 1 image not reproduced. Panel (a): Likelihood and Prior curves; panel (b): Posterior curve. Horizontal axis: Rate of U.S. Hurricanes Per Year (1.0 to 2.5). Vertical axis: Probability (0.00 to 0.04).]