Hurricane Forecasting Using Regression Modeling

Yue Liu, Ashley Corkill, Roger Estep
Group 10

Overview
- Introduction
- Choosing Predictors
- Modeling the Hurricane Data
- Conclusion and Results
Images Courtesy of NASA

Why Do We Forecast Hurricanes?

Photograph Courtesy of Vincent Laforet/The New York Times

Photograph Courtesy of NOAA

- Hurricanes are one of nature's most powerful forces. High winds and storm surge can put millions at risk.
- Even after landfall, hurricanes and tropical storms can produce tornadoes and deadly inland flooding.
- Forecasting potentially deadly storms helps protect lives and livelihoods.
Photograph Courtesy of Jason Cohen

- Over the past two centuries, tropical cyclones have been responsible for the deaths of about 1.9 million people worldwide.
- Large areas of standing floodwater lead to infection and contribute to mosquito-borne illnesses.
- Crowding of evacuees in shelters increases the risk of disease transmission.
- Tropical cyclones severely disrupt infrastructure, causing power outages and bridge destruction and hampering reconstruction efforts.
- On average, the Gulf and east coasts of the United States suffer approximately US $5 billion in cyclone damage every year. The majority (83%) of tropical cyclone damage is caused by severe hurricanes, category 3 or greater.

How Do We Tackle This Problem?
The best solution is to be prepared for a hurricane. This means anticipating tropical storms. We have historical data and a number of suspected predictors. The first step is to determine which of these predictors are most important. From there, we can estimate the hurricane count. Our priority is the hurricanes that make landfall on the eastern coast of the United States.

Our Region of Interest
Note that this does not include any of the Caribbean Islands, Central America, or South America.
Image Courtesy of RPI

Regression Analysis and Linear Modeling
Poisson Distribution
Hurricane count is always a non-negative integer, and we treat the counts as independent of each other. This lends itself well to the Poisson distribution. So the general form for this model will be
$$\ln(\mu) = B_0 + B_1 X_1 + B_2 X_2 + \cdots + B_n X_n$$
where $\mu$ is the expected hurricane count, $X_i$ is a predictor of the hurricane count, $B_i$ is a real coefficient, and $B_0$ is the intercept.
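To make the log-link form concrete, here is a minimal sketch in R. It uses synthetic data only: the predictors x1 and x2 and the coefficients 1.0, 0.5 and -0.3 are assumptions chosen for illustration, not values from the hurricane dataset. It shows that the fitted mean of a Poisson GLM is the exponential of the linear predictor.

```r
# Minimal sketch on SYNTHETIC data (not the project's covariates).
set.seed(1)
n  <- 200
x1 <- rnorm(n)                          # hypothetical predictor 1
x2 <- rnorm(n)                          # hypothetical predictor 2
mu <- exp(1.0 + 0.5 * x1 - 0.3 * x2)    # log link: ln(mu) = B0 + B1*x1 + B2*x2
y  <- rpois(n, lambda = mu)             # simulated counts (non-negative integers)

fit <- glm(y ~ x1 + x2, family = poisson)  # log link is the default for poisson
B   <- coef(fit)

# Rebuild the fitted mean by hand: exponentiate the linear predictor
eta    <- B[1] + B[2] * x1 + B[3] * x2
mu_hat <- exp(eta)

print(round(B, 3))
```

The hand-built mu_hat should agree with fitted(fit), which is a quick way to check that the coefficients are being read on the log scale.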

Choosing Our Variables
- We used a combination of intuition and previously discussed methods to choose the variables of our model out of the 32 possible predictors.
- Data was organized by on-season (June 1 to November 30) and off-season (December 1 to May 31).
- To produce a simpler model that still takes in sufficient information, data was averaged over the five-month period from November through March.
- Stepwise AIC selection (stepAIC in R) was used to get a rough first set of variables, using the seasonal division we had set up.

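## Step AIC using 5-month average ##
setwd("c:/users/ashley/dropbox/yeah HURRICANE PROJECT!!")
covariates.avg = read.csv(file.choose())
library(MASS)  # provides stepAIC(); the package name is case-sensitive
# Full Poisson model with every covariate as a predictor
temp.glm = glm(count ~ ., data = covariates.avg, family = poisson)
# k = log(n) uses the BIC-style penalty instead of the default k = 2
AIC.yeah = stepAIC(temp.glm, k = log(nrow(covariates.avg)), direction = "both")

This gave us the predictor set ammsst.csv, dm.csv, limsta.csv, nino12a.csv, sri.csv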
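One detail of the call above is the penalty k = log(nrow(covariates.avg)). The sketch below, on synthetic data with a hypothetical predictor x, illustrates that setting k to log(n) turns the AIC-style criterion into what is usually called BIC, which penalizes extra predictors more heavily than the default k = 2 once n exceeds about 7.

```r
# Minimal sketch on SYNTHETIC data: the k = log(n) penalty is the BIC penalty.
set.seed(2)
n <- 100
d <- data.frame(x = rnorm(n))                       # hypothetical predictor
d$count <- rpois(n, lambda = exp(0.5 + 0.4 * d$x))  # simulated counts

fit <- glm(count ~ x, data = d, family = poisson)

aic_default <- AIC(fit)                      # penalty k = 2
aic_logn    <- AIC(fit, k = log(nobs(fit)))  # penalty k = log(n)
bic         <- BIC(fit)                      # same quantity as aic_logn

print(c(AIC = aic_default, AIC_k_logn = aic_logn, BIC = bic))
```

Because the heavier penalty favors smaller models, a stepwise search with k = log(n) will tend to keep fewer covariates than one with k = 2.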

Testing and Trials
- From here, we reran the selection with each of the five months evaluated separately, which gave the same predictor set.
- However, an evaluation of all months gave the predictor set amon.csv, ggst.csv, nino3a.csv, qbo.csv, tna.csv, whwp.csv
- At this point, we decided to apply cross-validation to compare these models and to derive other potential models.

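Our general form for CV is:
temp.glm1 = glm(count ~ cov1.csv + ... + covn.csv, data = covariates.avg, family = poisson)
temp.glm2 = ...
library(boot)  # provides cv.glm()
set.seed(400)
# Cost: mean squared error on the log scale; eps avoids log(0)
cost <- function(y, yhat, eps = 0.0001) mean((log(y + eps) - log(yhat + eps))^2)
# delta[1] is the raw K-fold cross-validation estimate of prediction error
cv.out1 = cv.glm(covariates.avg, temp.glm1, cost, K = 10)$delta[1]
cv.out2 = ...
print(c(cv.out1, ..., cv.outn))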
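The general form can be tried end to end with a small runnable sketch. The data frame d, its columns x1, x2, x3 and the two candidate models m1 and m2 are all hypothetical stand-ins on synthetic data; only the cost function and the cv.glm call pattern follow the form above.

```r
# Minimal sketch on SYNTHETIC data: compare two candidate models by 10-fold CV.
library(boot)   # cv.glm(); 'boot' is a recommended package shipped with R

set.seed(400)
n <- 120
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n), x3 = rnorm(n))  # hypothetical covariates
d$count <- rpois(n, lambda = exp(1 + 0.6 * d$x1))             # simulated counts

# Two hypothetical candidate predictor sets
m1 <- glm(count ~ x1,           data = d, family = poisson)
m2 <- glm(count ~ x1 + x2 + x3, data = d, family = poisson)

# Log-scale squared-error cost; eps avoids log(0) when a count is zero
cost <- function(y, yhat, eps = 0.0001) mean((log(y + eps) - log(yhat + eps))^2)

cv1 <- cv.glm(d, m1, cost, K = 10)$delta[1]
cv2 <- cv.glm(d, m2, cost, K = 10)$delta[1]

cv.errors <- c(m1 = cv1, m2 = cv2)
print(cv.errors)
```

The model with the smaller cross-validation error is the one expected to predict unseen seasons better under this cost. Note that the fold assignment is random, so the ranking can change with the seed when the errors are close.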

Putting Data into GLM

> glm(formula = count ~ ammsst.csv + dm.csv + limsta.csv + nino12a.csv, family = poisson, data = covariates.avg)

Call:  glm(formula = count ~ ammsst.csv + dm.csv + limsta.csv + nino12a.csv,
    family = poisson, data = covariates.avg)

Coefficients:
(Intercept)   ammsst.csv       dm.csv   limsta.csv  nino12a.csv
    1.31995      0.09731     -0.43185     -0.15702     -0.03682

Degrees of Freedom: 755 Total (i.e. Null);  751 Residual
Null Deviance:      1581
Residual Deviance:  1521    AIC: 3772

Just one example of our results using GLM. Unfortunately, plugging in the data from 2011 gave us an incorrect prediction each time.
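As a rough reading of this output, the printed deviances can be turned into two quick summary numbers. This is a sketch using only the rounded values shown above, so the results are approximate; the interpretation in the comments is a common rule of thumb, not a finding of the project.

```r
# Values copied from the printed glm output above (already rounded by R)
null_dev  <- 1581
resid_dev <- 1521
null_df   <- 755
resid_df  <- 751

# Share of the null deviance removed by the four predictors
dev_explained <- 1 - resid_dev / null_dev

# Residual deviance per degree of freedom; values well above 1 are
# often taken as a hint of overdispersion relative to the Poisson model
dispersion <- resid_dev / resid_df

print(round(c(dev_explained = dev_explained, dispersion = dispersion), 3))
```

With these inputs the predictors account for roughly 4% of the null deviance and the ratio is about 2, which fits with the poor 2011 predictions reported for this model.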

Something Works!
- Errors: It was at this point that we realized that we were using 2011 data to predict 2011 hurricanes, which will obviously lead to skewed results.
- We removed the 2011 data and ran a new step AIC on the adjusted dataset. Correlation analysis from the pairs() function helped to confirm our results.
- The best covariate set was determined to be amon.csv, dm.csv, limhaw.csv, mei.csv, ngst.csv, nino34a.csv, nino4a.csv
- These covariates predicted a reasonable number of hurricanes for 2013 and predicted the correct number of hurricanes making landfall in 2011.
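The fix described here, holding the target year out of the fit, can be sketched as follows. The data are synthetic, and the column names year and x1 and the range 1950 to 2011 are assumptions made for illustration; the project's actual file layout is not shown in the slides.

```r
# Minimal sketch on SYNTHETIC data: fit without the target year, then predict it.
set.seed(3)
years <- 1950:2011                                   # hypothetical year range
d <- data.frame(year = years, x1 = rnorm(length(years)))
d$count <- rpois(nrow(d), lambda = exp(0.8 + 0.5 * d$x1))

train <- subset(d, year != 2011)   # model never sees the 2011 count
test  <- subset(d, year == 2011)   # only the 2011 covariates are used

fit <- glm(count ~ x1, data = train, family = poisson)

# type = "response" returns the expected count, not the log-scale predictor
pred_2011 <- predict(fit, newdata = test, type = "response")
print(pred_2011)
```

Comparing pred_2011 with the observed 2011 count then gives an honest check, because that observation played no part in estimating the coefficients.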

glm(formula = count ~ amon.csv + dm.csv + limhaw.csv + mei.csv + ngst.csv + nino34a.csv + nino4a.csv, family = poisson, data = covariates.avg)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-2.64599  -1.08117  -0.08738   0.60592   2.83641

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  1.409452   0.103576  13.608  < 2e-16 ***
amon.csv     2.828253   0.560791   5.043 4.57e-07 ***
dm.csv      -0.563010   0.206211  -2.730  0.00633 **
limhaw.csv  -0.867449   0.326988  -2.653  0.00798 **
mei.csv      1.158446   0.367110   3.156  0.00160 **
ngst.csv    -0.005826   0.002595  -2.245  0.02479 *
nino34a.csv -1.901833   0.487181  -3.904 9.47e-05 ***
nino4a.csv   0.955655   0.318822   2.997  0.00272 **

    Null deviance: 131.263  on 60  degrees of freedom
Residual deviance:  94.815  on 53  degrees of freedom
AIC: 291.15

Conclusion and Results
$$Y_i \sim \text{Poisson}(\mu_i)$$
The final model is:
$$\ln(\mu_i) = 1.409 + 2.828X_1 - 0.563X_2 - 0.867X_3 + 1.158X_4 - 0.006X_5 - 1.902X_6 + 0.956X_7$$

Variable Number   Variable Name
X1                Atlantic Multidecadal Oscillation
X2                Atlantic Dipole Mode
X3                LIMHAW
X4                Multivariate ENSO Index
X5                Northern Hemisphere Monthly Land-Surface Air and Sea-Surface Water Temperature Anomalies
X6                East Central Tropical Pacific SST
X7                Central Tropical Pacific SST
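To turn the final model into a predicted count, the linear predictor has to be exponentiated, since the Poisson GLM was fit with the default log link. The sketch below uses the rounded coefficients from the final model; the covariate values in x0 are hypothetical (all set to zero) and the function name expected_count is introduced here for illustration.

```r
# Rounded coefficients from the final model (intercept first, then X1..X7)
B <- c(intercept = 1.409, amon = 2.828, dm = -0.563, limhaw = -0.867,
       mei = 1.158, ngst = -0.006, nino34a = -1.902, nino4a = 0.956)

# Expected count for one set of covariate values (named numeric vector)
expected_count <- function(x) {
  eta <- B[["intercept"]] + sum(B[names(x)] * x)  # linear predictor, log scale
  exp(eta)                                        # back to the count scale
}

# HYPOTHETICAL covariate values: every index at zero
x0 <- c(amon = 0, dm = 0, limhaw = 0, mei = 0, ngst = 0, nino34a = 0, nino4a = 0)
print(expected_count(x0))   # equals exp(1.409), the baseline expected count
```

Each coefficient acts multiplicatively on the count scale: raising X1 by one unit multiplies the expected count by exp(2.828), holding the other covariates fixed.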

Prediction
Based on our model, we predict that eight or nine tropical storms will make landfall on the east coast in 2013.
Image Courtesy of NOAA
However, we considered only this one regression model. Further regression diagnostics and remedies might be helpful in getting a more accurate answer.