Twitter s Effectiveness on Blackout Detection during Hurricane Sandy

Similar documents
Superstorm Sandy What Risk Managers and Underwriters Learned

Tracking Hurricane Sandy

DM-Group Meeting. Subhodip Biswas 10/16/2014

Mapping Coastal Change Using LiDAR and Multispectral Imagery

Tropical Revolving Storms: Cuba 2008 By The British Geographer

Peterborough Distribution Inc Ashburnham Drive, PO Box 4125, Station Main Peterborough ON K9J 6Z5

Daily Operations Briefing. Sunday, October 22, :30 a.m. EDT

Preparing for Earthquakes in Dallas-Fort Worth Applying HAZUS to assess shelter accessibility. Crystal Curtis

[CLUB NAME] HURRICANE ACTIVATION PLAN [EXCERPT VERSION]

Detecting Heavy Rain Disaster from Social and Physical Sensors

The Tampa Bay Catastrophic Plan Presentation to CFGIS Users Group FDOT District 5 Urban Offices - Orlando July 30, 2010

Communications and Lessons Learned from

Citizen Science at the. U.S. Geological Survey

Ensuring the Building of Community Resiliency through Effective Partnership December 4, 2017 Melia Nassau Beach Valentino A. Hanna

Daily Operations Briefing. Tuesday, October 24, :30 a.m. EDT

Top 10 Actions a CIO Can Take to Prepare for a Hurricane

PUBLIC SAFETY POWER SHUTOFF POLICIES AND PROCEDURES

Daily Operations Briefing. Monday, October 30, :30 a.m. EDT

Lesson: Don t Wait For the Storm

Hurricane Sandy. Background Information and (first) Disaster Assessment Report. ( as of) Nov, 2 nd 2012

HURRICANES AND TORNADOES

Coping with Disaster: Maintaining Continuity in the Wake of Emergencies

Text Mining. Dr. Yanjun Li. Associate Professor. Department of Computer and Information Sciences Fordham University

Lab 20. Predicting Hurricane Strength: How Can Someone Predict Changes in Hurricane Wind Speed Over Time?

Superstorm Sandy Willis Re s post-event field damage survey preliminary report

Severe Weather Hazards Are Real

Kosciusko REMC Script August 18, 2014

Outage Maps: The MVP of Hurricane Season 2017

Very Dangerous Coastal Storm Sandy October 28 th 31 st 2012

Business Preparedness. Hurricane Preparedness 2018 NFP InForum February 15, 2018

Hurricane Tracking Lab

IS YOUR BUSINESS PREPARED FOR A POWER OUTAGE?

Hurricane Preparation and Recovery. October 11, 2011 Jon Nance, Chief Engineer, NCDOT

Phases of Disaster Response. John Yeaw, Gavin Vanstone, Haochen Wu, Jordan Tyler

Major Event Response Reporting Guelph Hydro Electric Systems Inc. (Guelph Hydro) May 4, 2018 Wind Storm

Aid to Critical Infrastructure and Key Resources During a Disaster. Pete Grandgeorge MidAmerican Energy Company

GC Briefing. Weather Sentinel Hurricane Florence. Status at 5 PM EDT (21 UTC) Today (NHC) Discussion. September 13, 2018

Careful, Cyclones Can Blow You Away!

Daily Operations Briefing. Sunday, October 29, :30 a.m. EDT

Measurement of human activity using velocity GPS data obtained from mobile phones

Deep Thunder. Local Area Precision Forecasting for Weather-Sensitive Business Operations (e.g. Electric Utility)

A Utility s Response and Recovery. Marc Butts Southern Company 03/19/2014

Caesar s Taxi Prediction Services

Hurricanes 1. Thunderclouds. cool, dry air falls. warm, moist air rises

Daily Operations Briefing. Wednesday, October 18, :30 a.m. EDT

Donna J. Kain, PhD and Catherine F. Smith, PhD East Carolina University

Winter Ready DC District of Columbia Public Service Commission

FLOODING. Flood any relatively high stream flow overtopping the natural or artificial banks in a water system.

Intelligence and statistics for rapid and robust earthquake detection, association and location

Hurricane Prediction with Python

Using Remote Sensing Technologies to Improve Resilience

Every year, newspapers from around the world cover huge ocean storms that hit different regions..

Electric Distribution Storm Hardening Initiatives. Paul V. Stergiou Distribution Engineering October 14 th, 2015

The AIR Severe Thunderstorm Model for the United States

Fort Lauderdale s GIS Supports Response to. Hurricane Irma

Hurricane Katrina August 29 th, 2005 Costliest Disaster/One of the 5 deadliest Hurricanes in US History. Total Cost: Approx. $100 Billion in damages O

WEDNESDAY 30 TH AUGUST, :57 p.m. Tropical Storm Irma forms in the Atlantic. Don t let your guard down, always #Be Ready.

Major Hurricane Matthew Briefing Situation Overview

Daily Operations Briefing Saturday, October 1, :30 a.m. EDT

Hurricane Lane. Hawaiian Islands, August By Ian Robertson 1, Ph.D., P.E.

Post-Hurricane Recovery: How Long Does it Take?

Damage and Restoration of Drinking Water Systems Caused by 0206 Tainan Earthquake and Future Mitigation Measures

Generative Clustering, Topic Modeling, & Bayesian Inference

Estimating the Spatial Distribution of Power Outages during Hurricanes for Risk Management

Pacific Catastrophe Risk Assessment And Financing Initiative

A Look Back at the 2012 Hurricane Season and a Look Ahead to 2013 & Beyond. Daniel Brown National Hurricane Center Miami, Florida 24 April 2013

Daily Operations Briefing December 25, 2012 As of 6:30 a.m. EST

City of Punta Gorda Community Emergency Management Plan 2013

Welcome Jeff Orrock Warning Coordination Meteorologist National Weather Service Raleigh

The Bayes classifier

Homework Assignment IV. Weather Exercises

Complete Weather Intelligence for Public Safety from DTN

Daily Operations Briefing. Thursday, October 26, :30 a.m. EDT

How we use social media to communicate weather news

KCC White Paper: The 100 Year Hurricane. Could it happen this year? Are insurers prepared? KAREN CLARK & COMPANY. June 2014

Behavioral Data Mining. Lecture 2

HAZUS-MH: A Predictable Hurricane Risk Assessment Tool for the City of Houston and Harris County

VISUAL EXPLORATION OF SPATIAL-TEMPORAL TRAFFIC CONGESTION PATTERNS USING FLOATING CAR DATA. Candra Kartika 2015

Director, Operations Services, Met-Ed

Daily Operations Briefing. Monday, October 9, :30 a.m. EDT

Unit 5: NWS Hazardous Weather Products. Hazardous Weather and Flooding Preparedness

Tropical Update. 11 AM EDT Wednesday, October 10, 2018 Hurricane Michael, Hurricane Leslie & Tropical Storm Nadine, Caribbean Low (40%)

Wind Tower Deployments and Pressure Sensor Installation on Coastal Houses Preliminary Data Summary _ Sea Grant Project No.

Building a Timeline Action Network for Evacuation in Disaster

Daily Operations Briefing. Saturday, May 13, :30 a.m. EDT

Understanding China Census Data with GIS By Shuming Bao and Susan Haynie China Data Center, University of Michigan

Geo-information and Disaster Risk Reduction in the Hindu Kush-Himalayan region

Daily Operations Briefing Tuesday, March 10, :30 a.m. EDT

Agency Vision and Decision- Maker Needs: A USGS Perspective

OUT OF THE DARKNESS AN ECLIPSE AND A POWER OUTAGE. Or..

Chapter 11: Dealing with Extreme Weather: Hurricanes in the Caribbean Overview Objectives Language Objectives: 1. Vocabulary/Geoterms:

EVALUATION AND VERIFICATION OF PUBLIC WEATHER SERVICES. Pablo Santos Meteorologist In Charge National Weather Service Miami, FL

Tropical Cyclone Sandy (AL182012)

f<~~ ~ Gulf Power April16, 2018

Gaussian Models

GC Briefing. Weather Sentinel Tropical Storm Michael. Status at 8 AM EDT (12 UTC) Today (NHC) Discussion. October 11, 2018

America must act to protect its power grid

Using Nightlight Remote Sensing Imagery and Twitter Data to Study Power Outages

The Saffir-Simpson Hurricane Wind Scale

Transcription:

Twitter s Effectiveness on Blackout Detection during Hurricane Sandy KJ Lee, Ju-young Shin & Reza Zadeh December, 03. Introduction Hurricane Sandy developed from the Caribbean stroke near Atlantic City, N.J., with winds of 80 mph about 8 p.m. EDT Oct. 9, 0. When the hurricane ripped through the Northeast, thousands of buildings and facilities were catastrophically damaged. And beyond the structural damage, Sandy also left.5 million people without electricity mostly in New York and New Jersey[]. It was simply the most recent time that the most populated corridor in the United States experienced massive power outages. Many victims of the power outages had to wait more than two weeks to get it back. The restoration of power outage took long time not only because the extent of damage from Hurricane Sandy far exceeded the destruction caused by previous weather-related events, but also because there was significant time lag for utilities to gather power outage information and to dispatch their limited number of field crews. The duration of power outage restoration depends on several factors such as the main cause of power outage and the number of areas affected. Because each outage is a result of different circumstances, some may take longer to identify and restore than others. For example, if a pole near house or business goes down, it is easy to identify what caused the outage, and utilities can immediately Reza Zadeh provided the raw Twitter data. get to work on the actual repairs. In that case, it typically takes about six hours to put up a new pole. However, during storm-related outages, restoration information may not be immediately available. Sometimes there is much damage that the utilities need to assess through a combination of remote monitoring and customer calls. Even with remote sensing and controlling system, utilities heavily rely on customer calls in the event of large blackout, resulting in time lag in their restoration efforts. During the storm and its aftermath, social media has been critical in spreading important information and mobilizing relief efforts. Some argue that Twitter was more accessible and farreaching than any TV network, with over 0 million Tweets posted during the height of the storm []. In this study, we examined and analyzed the abundant Twitter data to help minimize the damage by future catastrophic blackout events.. Related Work The analysis of natural disaster using Twitter data has been done by Burks et. Al in the study titled Early warning of ground shaking intensity using Tweets for earthquakes and by Sakaki et al. in the study titled Earthquake Shakes Twitter Users Real-time Event Detection by Social Sensors [3], [4]. These paper demonstrates results for giving advanced warning to critical facilities of the magnitude of ground motion intensity by k- nearest neighbors and a generalized linear model on tweet behavior. 3. Obective Our goal in this paper is to improve the outage detection process through the utilization of social media data. This goal consists of two subobectives, both of which involve machine learning using Twitter data: ) to develop an algorithm for early detection of the blackout

locations and ) to pick the optimal locations for the blackout response crews to set up their camps. 4. Methodology The second-costliest hurricane in United States history caused twitters to send more than 0 million tweets about the storm [5]. After filtering to collect the outage-related data, we developed an algorithm to identify tweets which correctly indicate blackout in the tweet origin location. This would significantly reduce time required to collect blackout location information for power utilities. Then, we created clusters of valid outage tweets so that utilities could use the cluster info to dispatch their field crews effectively and efficiently. The process map below (Figure ) shows complete methodology steps. Sandy were selected. Some of the filtered Twitter data are shown in Table below. Table. Sample Twitter Data ID Time stamp Latitude Longitude 0-0-30 :49: 40.650-73.950 0--0 04:39:03 40.768-74.68 3 0-0-30 7:35:30 40.74-74.00 4... Obtaining true blackout data mapping it to the filtered twitter data: First, we obtained a colored map representing blackout regions during the storm (Figure ). Using ArcGIS software, we plotted the filtered Twitter data on NASA SPoRT map. Then, we identified whether each tweet was created in blackout region. Figure 3 shows the plotted Twitter data with outage labeling. Text I have no work for the rest of the week altho Ill have no electricity either #thankssandy Shots in the dark #sandy #blackout #tequila http://t.co/5kvrs85u Still no electricity because of hurricane Sandy Been over 4 hours now. Kitty is sad and bored! Figure. Methodology Process Map 4.. Data Preprocessing 4... Filtering Twitter Data : First, we obtained the raw tweeter data containing all the tweets created in the United States during the storm from 0/30/0 to /5/0. We processed the data containing hurricane sandy to obtain tweets related to blackout of New Jersey. Each tweet line contains time lag between a tweet and the landfall of hurricane Sandy (EDT 0:00: :00, Oct 30, 0) and the latitude and longitude of a tweet, and the content of a tweet. To restrict target area to New Jersey, tweets written at latitude 38 6 N to 4 N, longitude 73 54 W to 75 34 W were chosen. Then, we matched keywords indicating blackout to each tweet line: blackout, power outage, no electricity, and no light. Also, to minimize outliers, we excluded tweets written before the landfall of hurricane. Finally, from the data filtering, overall 5000 lines of target tweets related to blackout of New Jersey by hurricane Figure. NASA SPoRT Map Hurricane Sandy Blackout Figure 3. Plotted Twitter data with blackout labeling

4.. Machine Learning Analysis 4... Naive Bayes classifier using the multinomial event model and the Laplace smoothing To enhance the reliability of filtered blackout data, we built a Naïve Bayes classifier which canvalidate the true power outage tweets. By applying this machine learning, outliers such as tweet lines mentioning news or not related to blackout occurred by Sandy were excluded. The accuracy of the classification by the Naïve Bayes classifier was estimated by comparing with the labeling by a blackout satellite map. Word x i denotes the identity of the i-th word in a tweeter line and y indicates whether the tweeter line informs real blackout or not. Given a training set{( x, y ; i,... m } where x ( x, x,..., x ) ni ( n is the number of words in the i-training example), i The likelihood of the overall probability of tweeters is given as follow: m n i L( φ, φk y 0, φk y ) p( x y; φk y 0, φk y p( y ; φ y) i To maximize of the likelihood of the data, the estimates of the parameters are as below: φ φ m ni { x } k y + i k y { y } n i i+ V m ni { x 0} k y + i k y 0 { y 0} n i i+ V φ y { y } i m 4... K- means clustering algorithm For the efficient allocation of power resources to manage blackouts induced by hurricane, clustering the locations of blackouts based on spatial extent is necessary. Here the k-mean clustering algorithm was applied to clustering the distributed blackout spots and finding optimal locations of power resources as below: Step. Initialize cluster centroids µ, µ,..., µ k R randomly within latitude 38 6 N to 4 N, longitude 73 54 W to 75 34 W where k is the number of power resources. Step. Repeat until the centroids converge: { For every i, set c : argmin x µ where x is a blackout spot. For every, set : m i µ m { c } x i ( i) { c } where m is the total number of blackout spots. } 4. Analysis Result Figure 4 depicts how accurately the Naïve Bayes classifier labels the true power outage tweets. As can be seen, five training sets in different sizes were used to train a model, and the error rate of classification decreased to about 30% when we employed the maximum size of a training set. Although the error rate is acceptable, it implies that there are difficulties in thoroughly distinguishing between the validate tweets and outliers because the context of tweets cannot be understood sorely by the frequency of words. Error rate 0.5 0.4 0.3 0. 0. 0 0 000 000 3000 4000 5000 Number of training samples Figure 4. Error rate of classification of true blackout tweets by the Naïve Bayes classifier, used five different size of training sets. 3

5. Conclusion and Future Works The Naïve Bayes classifier for identification of blackout using Twitter data seems to yield relatively high error rate (30%~40%) regardless of the size of training set. We believe that this high error rates are due to various reasons, such as small sample size, lack of feature word patterns, etc. This could be improved if there were more blackout-relevant hashtags in Twitter, in order to identify true blackout tweets. Also, there could have been errors in labeling true blackout tweets using NASA SPoRT map. Although the map is relatively accurate, the error rate could have improved if we used the actual blackout survey data from FEMA, which was unavailable to public unfortunately. Figure 5. Allocation of power resources by k-mean clustering algorithm. We conducted 000 iterations of the k-mean clustering algorithm for the convergence to 8 clustering centroids as shown in Figure 5. The spatially optimized centroids described in Table can be regarded as the efficient locations to supply power to blackout spots clustered by centroids. If there are a limited number of power resources and limited locations to place them, the k-mean clustering algorithm can result in very feasible clustering by adusting the number of centroids and selecting centroids in restricted coordinates. Table. Location of 8 power resources converged from k-mean clustering. No. Latitude Longitude 40.00-74.48 39.40-75.043 3 40.845-74. 4 40.5-74.8 5 40.009-75.75 6 40.568-74.399 7 40.76-73.947 8 39.40-74.455 During blackout, 8 optimized location of power stations can be designated by K-mean clustering expeditiously based on 504 twitter data. Though we selected 8 locations in our studies, the number of centroids can vary depending on the number of available emergency response teams from utility companies. The methodology we developed in this study can also be applied to other types of disasters in densely populated areas where social media is heavily used. We hope that our study could become a foundation for further related studies in prediction of various types of mega disasters for more rapid response to life-threatening events. 6. Reference: [] Huffington Post, Sandy: An Eye-Opener for the Power Grid http://www.huffingtonpost.com/danielmcgahn/power-grid_b_9554.html [] irevolution, Using Twitter to Map Blackouts During Hurricane Sandy (July 3, 303) http://irevolution.net/03/07/03/using-twitter-tomap-blackouts-during-hurricane-sandy/ 4

[3] Burks, L., Jordan, C., Miller, M., Zadeh, R., (03). Early warning of ground shaking intensity using Tweets, Association for the Advancement of Artificial Intelligence (in review), 6. [4] Sakaki, T., Okazaki, M., Matsuo, Y. (00). Earthquake Shakes Twitter Users Real-time Event Detection by Social Sensors, WWW00, April 6-30. [5] Hurricane Sandy and Twitter http://www.ournalism.org/0//06/hur ricane-sandy-and-twitter/ 5