Post on 04-Sep-2020

Dr. Ming-Hsiang (Ming) Tsou 鄒明祥

Email: mtsou@mail.sdsu.edu, Twitter @mingtsou

Director of the Center for Human Dynamics in the Mobile Age

Professor, Department of Geography, San Diego State University

Utilize Social Media and Big Data to Build a Real-time Disaster Management System

The 2019 International Training Workshop for Natural Disaster Reduction - Applying Big Data and Social Media for Disaster Risk Reduction and Emergency Preparedness: Workshop #2 on Social Media: June 25, 2019, Taipei, Taiwan.

San Diego State University San Diego, California

This image was provided by Bill Clayton during the 2003 wildfire in San Diego (map.sdsu.edu).

2007 San Diego Wildfires

Wildfires

The Center for Human Dynamics in the Mobile Age

Dr. Brian Spitzberg

Dr. Jean Mark Gawron

Dr. Michael Peddecord

Dr. Heather Corliss

Dr. Jay Lee

Dr. Xuan Shi

Dr. Xinyue Ye

NSF Projects (CDI, IBSS, IMEE) + NIH pilot

HDMA Center Faculty

Dr. John Elder

Dr. Piotr Jankowski

Dr. Lourdes Martinez

Dr. Atsushi Nara

Dr. Eric Buhi

Dr. Joseph Gibbons

Graduate Students

• Chanwoo Jin

• Nana Luo

• Haihong Huang

• Jeff Yen

• Stefany Pickett

Dr. Ming-Hsiang Tsou (Director)

Dr. Bruce Appleyard

Dr. Xianfeng (Terry) Yang

Dr. Sahar Ghanipoor Machiani

Dr. Caroline A. Thompson

Jaehee Park

Dr. Eyal Oren

What is “Human Dynamics”? (人文動力學)

Smart Phones - 2007 (The Mobile Age)

The most important scientific instrument in the 21st Century.

(2014 Sales: 1.2 billion units)

Human Dynamics is a transdisciplinary research field focusing on the understanding of dynamic patterns, relationships, narratives, changes, and transitions of human activities, behaviors, and communications.

Animated Image created by the HDMA Center (Hao Zhang).

Visualizing Human Dynamics with the Geo-tagged Social Media (Tweets)

How can Human Dynamics and Social Media Facilitate Emergency Preparedness and Disaster Responses?

1. Understand/simulate human behaviors, actions, and human movements (paths) before/after the evacuation period.

2. Detect potential and real-time ground truth problems (traffic jams, shelter locations, needs for food and water in the field, real-time monitoring, real-time communication, etc.)

3. Effective disaster responses and evacuation procedures (Intelligent Spatial Decision Support Systems)

(Diagram labels: Place, Time, Social Media Data)

Geography (place and time) is the KEY for Understanding and Integrating Social Media and Big Data

KDC (Knowledge Discovery in Cyberspace) framework (modified from Tsou and Leitner, 2013)

Tsou, M. H. and Leitner, M. (2013). Editorial: Visualization of Social Media: Seeing a Mirage or a Message? In Special Content Issue: "Mapping Cyberspace and Social Media". Cartography and Geographic Information Science. 40(2), pp. 55-60. DOI: 10.1080/15230406.2013.776754

Information Synthesis

• Synthesis literally refers to a combination of two or more entities together to form something new.

• Similar to “Transdisciplinary” research.

• 1 + 1 + 1 = ? Greater than 3: 3 x 3 = 9 (after synthesizing).

• Geography (spatial connection) is the glue that combines multiple sources of information and enables information synthesis.

Research Showcase #1:

NSF IMEE project: Integrated Wildfire Evacuation Decision Support System (IWEDSS) 2016 - 2019

https://decisionsupport.sdsu.edu/

Award amount: $465,189

Ming-Hsiang Tsou, mtsou@mail.sdsu.edu (Principal Investigator)

Atsushi Nara (Co-Principal Investigator)

Sahar Ghanipoor Machiani (Co-Principal Investigator)

Xianfeng Yang (Co-Principal Investigator)

Utilize Big Data and GIS models for Disaster Preparedness

Hourly population density estimation, comparing weekdays and weekends (Hao Zhang and Ming): “Unique Twitter User Population Density” within Census Blocks and LandScan grids, using 2015 geo-tagged tweets in San Diego County (after the cleaning process).

Hourly Population Estimation

Hourly Twitter User Density Change in Downtown (Units: Census Blocks)

Weekday (Left) vs. Weekend (Right).

Visualizing Dynamic Spatiotemporal Patterns

Estimated Population (in each polygon) = [Unique Twitter User Numbers in each polygon] x [Temporal Variation Factor] x [Spatial Variation Factor]

Dynamic Population Model

18:00 T-value = 1

8:00am T-value = 1.97
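The multiplicative Dynamic Population Model can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the IWEDSS implementation: the function name, the user count of 40, and the spatial factor of 25.0 are hypothetical, and treating the slide's T-value as the temporal variation factor is an assumption.

```python
def estimate_population(unique_users: float,
                        temporal_factor: float,
                        spatial_factor: float) -> float:
    """Estimated population in one polygon for one hour.

    Computes: unique Twitter users x temporal variation factor
    x spatial variation factor.
    """
    if min(unique_users, temporal_factor, spatial_factor) < 0:
        raise ValueError("all inputs must be non-negative")
    return unique_users * temporal_factor * spatial_factor


# Hypothetical polygon: 40 unique users, spatial factor 25.0.
# The T-values are the two quoted on the slide (assumed to be
# temporal variation factors).
evening = estimate_population(40, 1.0, 25.0)    # 18:00
morning = estimate_population(40, 1.97, 25.0)   # 8:00 am
```

With the same number of observed users, the 8:00 am estimate is scaled up by the larger T-value relative to the 18:00 estimate.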

We used a dasymetric mapping technique to redistribute the unique Twitter user population based on the ratio of the average census population to the average hourly unique Twitter user population in each land use category (ten types). The goal is to refine the population density maps using land use data (residential, industrial, commercial, agricultural areas, etc.) and census data.
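The ratio-based redistribution can be illustrated with a short Python sketch. It is a simplified illustration, not the project's code: the function names, the two land use categories, and all numbers are hypothetical, and it assumes each polygon has a single land use category.

```python
from typing import Dict


def land_use_ratios(avg_census_pop: Dict[str, float],
                    avg_hourly_users: Dict[str, float]) -> Dict[str, float]:
    """Ratio of average census population to average hourly unique
    Twitter users, per land use category."""
    ratios = {}
    for land_use, users in avg_hourly_users.items():
        if users > 0:  # skip categories with no observed users
            ratios[land_use] = avg_census_pop[land_use] / users
    return ratios


def redistribute(users_by_polygon: Dict[str, float],
                 land_use_by_polygon: Dict[str, str],
                 ratios: Dict[str, float]) -> Dict[str, float]:
    """Scale each polygon's unique-user count by its land use ratio.
    Polygons whose category has no ratio get an estimate of 0."""
    return {
        polygon: users * ratios.get(land_use_by_polygon[polygon], 0.0)
        for polygon, users in users_by_polygon.items()
    }


# Hypothetical inputs for two land use categories and two polygons.
ratios = land_use_ratios(
    {"residential": 1200.0, "commercial": 300.0},
    {"residential": 12.0, "commercial": 30.0},
)
estimates = redistribute(
    {"A": 5.0, "B": 8.0},
    {"A": "residential", "B": "commercial"},
    ratios,
)
```

In this toy example each observed user in a residential polygon stands for more residents than one in a commercial polygon, which is the kind of refinement the land use data is meant to provide.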

Dasymetric Mapping Result (2016)

Census Data Mapping Result (2010)

Dynamic Population Estimate Model – Validation

• Obtained GPS-quality location data (AirSage) during the Lilac Fire in Dec. 2017

• Unique smartphone device density aggregated at 10m x 10m grids per hour

• Compare model estimates: pre-, during, and post-evacuation

Aggregated at 500m x 500m grids

Create Evacuation Plan with Transportation Models

• Determine the impact areas;

• Predict the spread of wildfire over time (Wildfire Simulation Software);

• Determine the evacuation risk zones (ERZ) using traffic analysis zones (TAZ);

• Optimize the evacuation time of each ERZ.

Figure 4. A map of the San Diego County Wildfire Evacuation Plan at 3:30 AM, October 25, 2007.

Estimate Evacuation rates for different zones (TAZ zones).

Evacuation Decision-Making model using social media data

How to Estimate Evacuation Rate?

Proposed Bayesian Network for Evacuation Decision

IWEDSS Technological Framework

An integrated wildfire evacuation decision support system (IWEDSS).

Prototype Design and Implementation

Waze Connected Citizens Program https://wiki.waze.com/wiki/Connected_Citizens_Program

APIs to get real-time Waze data:

• Waze Alerts: Road_closed, Weather Hazard, Jam, Accident

• Waze Jams: Lines with speed update.

Research Showcase #2:

Tracking Human Mobility Patterns during Hurricane Matthew (2016).

Han, Su Yeon, Ming-Hsiang Tsou, Elijah Knaap, Sergio Rey, and Guofeng Cao. "How Do Cities Flow in an Emergency? Tracing Human Mobility Patterns during a Natural Disaster with Big Data and Geospatial Data Science." Urban Science 3, no. 2 (2019): 51.

Tracking the movement of “Evacuation Zone Residents” (users who tweeted inside these zones before/after the evacuation order) during Hurricane Matthew, with 32,735 tweets (South Carolina), 33,019 tweets (Georgia), and 63,642 tweets (Florida).

Identify real users and residents

South Carolina residents move to Atlanta? (similar to Weekend movement patterns?)

Our estimate of the evacuation rate is 35%, which is similar to Cutter et al.’s estimate (37% versus 35% in 2011) but considerably lower than the 56% reported by Dow and Cutter in 2002 [47].

We found that evacuees are likely to travel 200–400 km during hurricane evacuations, but this distance might not be enough to find safe destinations for those living in the Florida Keys, a string of islands stretching about 200 km.

During the Evacuation Period (10/3 – 10/8)

Red lines: Move-Out

Green lines: Move-In

After the Evacuation Period (10/8 – 10/13)

Dr. Zhenlong Li Director, Geoinformation and Big Data Research Laboratory

Assistant Professor, Department of Geography

University of South Carolina

zhenlong@sc.edu, http://gis.cas.sc.edu/gibd

Other Relevant Research in U.S.

Social Networks Real-world decision making

Apply AI and deep learning algorithms to disaster-related photos.

Research Showcase #3:

Real-time Situation Awareness Viewer for Monitoring Disaster Impacts Using Location-Based Social Media Messages (Twitter). (CBS video).

San Diego County: Office of Emergency Services (OES)

ReadySD Social, Version 1.1 (Google Play Store and Apple iOS)

Users can register as participants on the first page.

Link: Click the link button to go to the original tweet.

Share: You can share the message by text message, email, or Facebook, or through any other social media app installed on your phone.

Modified the retweet message for tracking the diffusion of messages: “Please help us spread this info” is automatically included at the beginning of the sentence, so the retweet becomes a new, trackable tweet.

We are tracking each button clicked

Diagram labels (message diffusion tracking):

• OES Tweets: New Tweets (Original Info); RT other agencies (Not Original)

• ReadySD (trackable); ReadySD (Shared by Username, trackable)

• RT (trackable); RT (not trackable, only track others' RT count)

• Level 1: Other Users or Volunteers; Level 2

Implementation Challenges: Need to Partner with local volunteer groups. (We only recruited around 300 users in three years).

SMART Dashboard framework (diagram labels):

• Geo-Targeting Data Collection (Twitter APIs, Application Programming Interfaces)

• Analysis: Filter, Machine Learning, Trend Analysis, Spatial Analysis

• Web-based Visualization: SMART Dashboard

Hurricane Irma in Tampa (Gas Shortage) http://vision.sdsu.edu/ec2/smart2/Hurricane-Irma-Tampa?userID=hdma

Shortage of “gas”: peaks on 9/01 (gas price up) and 9/05 (out of gas), with many non-RT (original) tweets (green line); lowest on 9/10, then rising again.

Blue line: RT and original tweets.

Green line: Only original tweets (without retweets).

Hurricane Irma in Tampa (Power Problems) http://vision.sdsu.edu/ec2/smart2/Hurricane-Irma-Tampa?userID=hdma

Problems with “power”: peak on 9/10 (power blackout in Tampa), with many non-RT (original) tweets (green line); lowest on 9/10, then rising again (real-time monitoring of the situation, every 10 minutes).

Blue line: RT and original tweets.

Green line: Only original tweets (without retweets).
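The two trend lines (all tweets versus original tweets only) can be reproduced in principle by counting keyword matches per day. The sketch below is illustrative and is not the SMART Dashboard code: the function names and sample tweets are hypothetical, and it assumes tweets in the Twitter v1.1 JSON layout, where a retweet carries a `retweeted_status` object and `created_at` looks like "Tue Sep 05 14:00:00 +0000 2017".

```python
from collections import Counter
from datetime import date, datetime


def is_retweet(tweet: dict) -> bool:
    """Treat a tweet as a retweet if it has a `retweeted_status`
    object or its text starts with the manual "RT @" convention."""
    return ("retweeted_status" in tweet
            or tweet.get("text", "").startswith("RT @"))


def daily_counts(tweets, keyword):
    """Return (all_counts, original_counts): per-day counts of tweets
    whose text contains the keyword (case-insensitive)."""
    all_counts, original_counts = Counter(), Counter()
    for tweet in tweets:
        if keyword.lower() not in tweet.get("text", "").lower():
            continue
        day = datetime.strptime(tweet["created_at"],
                                "%a %b %d %H:%M:%S %z %Y").date()
        all_counts[day] += 1          # blue line: RT + original
        if not is_retweet(tweet):
            original_counts[day] += 1  # green line: original only
    return all_counts, original_counts


# Hypothetical sample tweets.
sample = [
    {"created_at": "Tue Sep 05 14:00:00 +0000 2017",
     "text": "No gas anywhere in Tampa"},
    {"created_at": "Tue Sep 05 15:30:00 +0000 2017",
     "text": "RT @someone: stations out of gas"},
    {"created_at": "Sun Sep 10 09:00:00 +0000 2017",
     "text": "Power is out"},
]
all_counts, original_counts = daily_counts(sample, "gas")
sept5 = date(2017, 9, 5)
```

A large gap between the two counts on a given day suggests that the peak is driven by retweets rather than by first-hand reports.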

Spatial Analysis

Geo-tagged Tweets (in GeoViewer) with keyword “Drunk” in San Diego

Comparing Spatial Cluster of DUI Records (Red dots, Left side) and Tweets with “Drunk” keyword (Right side).

Similar Spatial Pattern? (Dynamic monitoring in REAL TIME?)

GeoViewer (Search “drunk” for two months) GIS Map with DUI Records

Big Data Fusion (Integration)

Challenges in Social Media Research

1. Data sampling and representativeness (social media users are young and biased?)

2. Data noise (bots, advertisements, fake news)

3. No control over the data source or data APIs (dependent on social media companies: Twitter, Facebook, Instagram, etc.).

Noises and Bots in Social Media

Detect robot tweets or advertisement tweets (noise) in geo-tagged tweets by examining the “source” metadata field. The portion of data noise is significant (29.42%) in our case study.

The number of tweets (in San Diego) produced by different platforms during November 2015, based on the [source] field in the tweet JSON documents.
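A minimal sketch of source-based noise filtering follows. It is not the authors' filter: the whitelist of "human" clients, the function names, and the sample tweets (including the bot name) are hypothetical, and it assumes the Twitter v1.1 JSON layout in which `source` is an HTML anchor wrapping the client name.

```python
import re
from collections import Counter

# Hypothetical whitelist of clients treated as human-operated.
HUMAN_SOURCES = {
    "Twitter for iPhone",
    "Twitter for Android",
    "Twitter Web Client",
    "Instagram",
}

_TAG = re.compile(r"<[^>]+>")


def source_name(tweet: dict) -> str:
    """Strip the HTML anchor from `source`, leaving the client name."""
    return _TAG.sub("", tweet.get("source", "")).strip()


def noise_share(tweets):
    """Return (tweets per source, share of tweets from sources
    outside the whitelist)."""
    counts = Counter(source_name(t) for t in tweets)
    total = sum(counts.values())
    noise = sum(n for name, n in counts.items()
                if name not in HUMAN_SOURCES)
    return counts, (noise / total if total else 0.0)


# Hypothetical sample: three human-client tweets and one bot tweet.
sample = [
    {"source": '<a href="http://twitter.com/download/iphone" '
               'rel="nofollow">Twitter for iPhone</a>'},
    {"source": '<a href="http://twitter.com/download/iphone" '
               'rel="nofollow">Twitter for iPhone</a>'},
    {"source": '<a href="http://twitter.com/download/android" '
               'rel="nofollow">Twitter for Android</a>'},
    {"source": '<a href="http://example.com" '
               'rel="nofollow">ExampleJobBot</a>'},
]
counts, share = noise_share(sample)
```

A whitelist is only one possible design; a blacklist of known bot or advertising platforms would flag fewer tweets and leave unknown sources in the data.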

Who are the Users? Humans or robots (bots)?

Use SMART dashboard to track “E-cigarette” topics

High Peak on Feb 11, 2016 (Why?)

11,114 - 9,561 = 1,553 (mummy or ghost Twitter accounts, for advertisement?)

1,553 Twitter accounts posted the exact same sentence in one day (2/11/2016).

Are they “mummies and ghosts” (zombies)?

Who are they? How do they post the messages?
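One way to surface such accounts is to group tweets by day and text and count the distinct accounts behind each sentence. The sketch below is an illustration, not the method used in the study: the record fields (`day`, `user`, `text`), the sample data, and the threshold are hypothetical, and normalizing case and whitespace before comparison is an added assumption.

```python
from collections import defaultdict


def duplicate_sentences(tweets, min_accounts):
    """Group tweets by (day, normalized text) and keep the groups
    posted by at least `min_accounts` distinct accounts."""
    accounts = defaultdict(set)
    for tweet in tweets:
        # Normalization (assumption): collapse whitespace, lowercase.
        text = " ".join(tweet["text"].split()).lower()
        accounts[(tweet["day"], text)].add(tweet["user"])
    return {key: users for key, users in accounts.items()
            if len(users) >= min_accounts}


# Hypothetical records: three accounts post the same sentence.
records = [
    {"day": "2016-02-11", "user": "acct_a",
     "text": "Try our new vape today!"},
    {"day": "2016-02-11", "user": "acct_b",
     "text": "Try our new vape  today!"},
    {"day": "2016-02-11", "user": "acct_c",
     "text": "try our new vape today!"},
    {"day": "2016-02-11", "user": "acct_d",
     "text": "Reading about e-cigarette rules"},
]
suspicious = duplicate_sentences(records, min_accounts=3)
```

Counting distinct accounts rather than tweets keeps one prolific account from triggering the flag on its own.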

Bad News about Geo-tagged Tweets.

https://techcrunch.com/2019/06/18/twitter-will-remove-location-tagging-in-tweets-citing-lack-of-use/

Researchers have no control over the data source or data APIs.

What geospatial information can we find in Twitter data?

Example: Use the Twitter Search API to search for the keyword “HIV test” or “HIV testing”.

• Only 1% - 7% of tweets have X, Y geo-coordinates (from GPS or geo-tagging).

• But 50% - 60% of tweets have city-level locations provided by user profiles.

• 90% of tweets have a time zone (limited spatial meaning).
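The fallback from precise coordinates to profile location to time zone can be sketched as a small classifier. This is an illustrative sketch, not the authors' code: the function name and sample tweets are hypothetical, and it assumes the Twitter v1.1 JSON fields `coordinates`, `user.location`, and `user.time_zone`.

```python
from collections import Counter


def location_level(tweet: dict) -> str:
    """Return the most precise location information a tweet offers."""
    if tweet.get("coordinates"):
        return "coordinates"        # X, Y from GPS or geo-tagging
    user = tweet.get("user") or {}
    if user.get("location"):
        return "profile_location"   # free-text, often city-level
    if user.get("time_zone"):
        return "time_zone"          # limited spatial meaning
    return "none"


# Hypothetical sample tweets.
sample = [
    {"coordinates": {"type": "Point", "coordinates": [-117.16, 32.72]},
     "user": {"location": "San Diego, CA"}},
    {"coordinates": None,
     "user": {"location": "San Diego, CA",
              "time_zone": "Pacific Time (US & Canada)"}},
    {"coordinates": None,
     "user": {"location": "",
              "time_zone": "Pacific Time (US & Canada)"}},
    {"coordinates": None, "user": {"location": "", "time_zone": None}},
]
levels = Counter(location_level(t) for t in sample)
```

The classifier assigns each tweet to exactly one level, so its shares are not directly comparable to overlapping percentages, where a single tweet can have coordinates, a profile location, and a time zone at once.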

Human Dynamics in the Mobile Age (HDMA)

Thank You Q & A

Dr. Ming-Hsiang (Ming) Tsou mtsou@mail.sdsu.edu

Twitter @mingtsou

http://humandynamics.sdsu.edu/

Funded by

• NSF Interdisciplinary Behavioral and Social Science (IBSS) Program, Award #1416509 ($1 million, PI: Tsou, 2014-2019). “Spatiotemporal Modeling of Human Dynamics Across Social Media and Social Networks”. http://socialmedia.sdsu.edu/

• NSF IMEE Program, Award #1634641. “Integrated Stage-based Evacuation with Social Perception Analysis and Dynamic Population Estimation”. $449,202, PI: Tsou, 2016-2019. http://decisionsupport.sdsu.edu
