Post on 16-Jul-2015
Disaster Data Informatics for Situation Awareness
Ashutosh Jadhav
ashutosh@knoesis.org
Ohio Center of Excellence in Knowledge Enabled Computing (Kno.e.sis)
Wright State University, Dayton, OH
Disaster Data Informatics for Situation Awareness
Expedite the decision-making process in a disaster situation by identifying useful/actionable information from social media
1. Informativeness analysis
a. Identify information-rich tweet messages (filtering out noisy tweets) based on a variety of analyses
2. Classifying information-rich messages
a. People at the disaster site: suffering people asking for help
b. Global response to the disaster (opinions, comments, news, etc.)
3. Expedite the decision-making process and situational awareness
a. Using (2.a), understand needs at the disaster site
b. Connect resources --> needs
Motivation: Information Overload
● 5,500 tweets per second during the Japanese earthquake and tsunami
"Within a minute of the quake, there were more than 40,000 earthquake-related Tweets. The micro-blogging site said it hit about 5,500 Tweets per second on the quake..."
- The New York Times
How can useful and actionable information be found quickly in such a huge stream of incoming event data?
Multidimensional data
Dimensions compared: data generated at the disaster location vs. data generated around the world
● Who generates the data? (People)
○ At the disaster location: affected people, NGO volunteers
○ Around the world: people not directly involved in the disaster
● What data is generated? (Content)
○ At the disaster location: reports about the current situation, needs for resources, medical and other emergencies, complaints, etc.
○ Around the world: opinions, concerns, sympathy, desire to help; sharing of related news, blogs and other multimedia
● How is the data generated? (Network)
○ At the disaster location: social media (Twitter, FB); SMS and Web reports to involved NGOs and government organizations
○ Around the world: mainly through social media (Twitter, Facebook, blogs, etc.)
● Why is the data generated? (Intention)
○ At the disaster location: seeking help; informing about the current situation, needs, etc.
○ Around the world: sharing personal viewpoints on disaster-related incidents
● When is the data generated? (Time)
○ At the disaster location: after the disaster, in the recovery and rebuilding phase
○ Around the world: mostly after the disaster
Research problem
How can we identify useful, informative (actionable) information
that can be used to expedite decision making and situational awareness
in a disaster situation?
Informativeness Analysis - Definition
● Useful/actionable information in a disaster situation that can support better and faster situation awareness
Example messages
We need tent, cover, rice. Uneted Nation never Help us since the earthquake, we live in Carre-four, Lapot street,
if women and children are victim of rape or other agressions in provisionnal shelter, what number can we call to have fast assistance.
We are still under the sheets. We do not have: Tents, prelates, sanitary articles and household etc. Bastien the city Alix fontamara 27
we don't have some water in the delmas camp 40b
We need tent indelmas 18 because we don't find nothing in the area.
How can we find help and food in fontamara 43 rue menos
A father, whose wife passed away, and has two children who need medical attention. One child has a broken arm, and he is afraid of infection
Data set
● Social networking messages
○ Twitter, Facebook
● News articles
○ News websites, external links from tweets, FB status
● NGO messages
○ Ushahidi messages/reports
● Mobile messages
○ SMS
Informativeness Analysis
Content Analysis
● Structure and syntactic analysis
● Linguistic analysis
● Text analysis
● Metadata analysis
People Analysis
● Author profile description
● Social connectivity
● Activity level
● Author credibility/influence
News Analysis
● Content analysis
● Social share analysis
● URL credibility
● Alexa analysis
Semantic Analysis
● Content annotation using a disaster domain model, considering entities mentioned, needs, resources, location, organizations, people, disaster type, etc.
Content Analysis
● Structure and syntactic analysis
○ Message length
○ Number of words, special characters, slang words, dictionary words
● Linguistic analysis
○ Number of nouns, verbs, adverbs, adjectives
○ POS patterns
● Text analysis
○ N-gram analysis
○ TF-IDF statistics
○ Entities (DBpedia/ontology)
● Metadata analysis
○ Publish time
○ Location (explicit and implicit)
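A few of the structural and text features listed above could be computed along these lines. This is only a minimal sketch: the tiny `DICTIONARY` word list, the function names and the tokenization rule are assumptions made for illustration, not part of the original slides.

```python
import re
import string

# Hypothetical mini-dictionary; a real system would load a full word list.
DICTIONARY = {"we", "need", "tent", "in", "water", "the", "camp", "food", "help", "and"}


def tokenize(message):
    """Lowercase the message and split it into alphabetic word tokens."""
    return re.findall(r"[a-z']+", message.lower())


def structural_features(message, dictionary=DICTIONARY):
    """Return simple structure/syntax features for one message."""
    words = tokenize(message)
    return {
        "length": len(message),  # message length in characters
        "num_words": len(words),
        "num_special_chars": sum(ch in string.punctuation for ch in message),
        "num_dictionary_words": sum(w in dictionary for w in words),
    }


def ngrams(message, n=2):
    """Return the word n-grams of a message as a list of tuples."""
    words = tokenize(message)
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


features = structural_features("We need water & food!!")
bigrams = ngrams("We need water & food!!")
```

Feature dictionaries like this one could then feed whatever classifier or ranking step is used to separate information-rich messages from noisy ones.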
People Analysis
● Author profile description
○ Profession
○ Demographic information (age, gender, location)
● Social connectivity
○ Number of followers and accounts followed
● Activity level
○ Number of tweets
○ Number of tweets "on topic"
● Author credibility/influence
○ Klout
○ SocialMatica
○ PeerIndex
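The activity-level features (number of tweets and number of tweets "on topic") could be approximated with a simple keyword match. The sketch below assumes that "on topic" means "mentions at least one term from a given topic vocabulary"; that definition, the function name and the sample data are illustrative assumptions rather than the method used in the slides.

```python
import re


def activity_level(tweets, topic_terms):
    """Count an author's tweets and how many mention any topic term."""
    terms = {t.lower() for t in topic_terms}
    on_topic = 0
    for text in tweets:
        tokens = set(re.findall(r"\w+", text.lower()))
        if tokens & terms:  # at least one topic term appears in the tweet
            on_topic += 1
    total = len(tweets)
    return {
        "num_tweets": total,
        "num_on_topic": on_topic,
        "on_topic_ratio": on_topic / total if total else 0.0,
    }


# Hypothetical author timeline and topic vocabulary.
timeline = [
    "Earthquake hit the city",
    "Lunch was great",
    "Need water after the earthquake!",
]
stats = activity_level(timeline, {"earthquake", "tsunami"})
```

A ratio rather than a raw count keeps very active but off-topic accounts from looking more relevant than occasional posters who write only about the event.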
News Analysis
● News and other event-related stories are linked in many of the event-related messages (tweets, etc.), primarily because of:
○ Message size limitation (140 characters for Twitter)
○ Bringing in external authoritative context
● Analyzing news and other event-related stories plays a crucial role in event analysis
● There are many news stories about the event:
■ Which news stories to focus on?
■ How to extract useful and actionable information nuggets from these news stories?
News Analysis
Content analysis
- Structure and syntactic analysis
- Linguistic analysis
- Text analysis
- Metadata analysis
Social share analysis
- Number of tweets, retweets
- Facebook shares, likes, comments, recommendations
- Google Plus, LinkedIn shares
URL credibility
- Google PageRank
- Local credibility (?)
Alexa analysis (Alexa is a web information company)
- Alexa global and country rank
- Alexa URL authority
- Alexa URL and subdomain mozRank
- Alexa page and domain authority
Semantic Analysis
● Content annotation using a disaster domain model, considering the variety of entities mentioned (DBpedia)
○ Needs, resources, location, organizations, people, disaster type, etc.
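As a rough illustration of content annotation, a message can be tagged by looking up its tokens in a lexicon derived from the domain model. The lexicon below is a hypothetical stand-in (a few terms taken from the example messages), not the DBpedia-backed model the slides refer to, and the category names are assumptions.

```python
import re

# Hypothetical lexicon: surface term -> category in the disaster domain model.
LEXICON = {
    "tent": "need",
    "water": "need",
    "food": "need",
    "rice": "need",
    "earthquake": "disaster_type",
    "flood": "disaster_type",
    "delmas": "location",
    "fontamara": "location",
}


def annotate(message, lexicon=LEXICON):
    """Group the lexicon terms found in a message by their category."""
    found = {}
    for token in re.findall(r"[a-z]+", message.lower()):
        category = lexicon.get(token)
        if category is None:
            continue
        terms = found.setdefault(category, [])
        if token not in terms:  # keep each term once, in order of appearance
            terms.append(token)
    return found


annotations = annotate("we don't have some water in the delmas camp 40b")
```

Exact token lookup is deliberately naive: it would miss run-together or misspelled forms such as "indelmas", which is one reason a richer entity-spotting step would be needed in practice.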
Semantic Disaster Model
Reuse (or formalise and build) a disaster domain model considering:
● Disaster type: earthquake, floods, terror attack (the disaster type helps in better understanding the needs)
● Needs: model of basic human needs in disasters, such as food, water, medicines, shelter, etc.
● Resources: model of resources that can satisfy a need, e.g. need: thirsty -> resource: water, fruit juice; need: hungry -> resource: food
● Location: location of incidents, geo-location data
● Organization: involved government and non-government organizations
● People & social role: model of people based on gender, age group and role (mother, father, son, etc.). This can help in understanding/reasoning about needs; for example, if a mother and a baby are mentioned, the need may be milk.
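The need-to-resource mapping and the role-based reasoning described in this model can be sketched with two small lookup structures. The entries mirror the examples given on the slide (thirsty, hungry, mother and baby); the data layout and the function names are assumptions made for illustration, not a formal ontology.

```python
# Need -> resources that can satisfy it (examples taken from the slide).
NEED_TO_RESOURCES = {
    "thirsty": ["water", "fruit juice"],
    "hungry": ["food"],
}

# Hypothetical rule format: roles mentioned together -> need they may imply.
ROLE_RULES = [
    ({"mother", "baby"}, "milk"),
]


def resources_for(need):
    """Return the resources that can satisfy a need (empty list if unknown)."""
    return NEED_TO_RESOURCES.get(need, [])


def inferred_needs(roles_mentioned):
    """Return needs implied by the social roles mentioned in a message."""
    roles = set(roles_mentioned)
    return [need for required, need in ROLE_RULES if required <= roles]


print(resources_for("thirsty"))                       # ['water', 'fruit juice']
print(inferred_needs(["mother", "baby", "father"]))   # ['milk']
```

A full model would express these links as ontology relations so that they can be reused across disaster types, but the lookup form is enough to show how a mention of roles can surface a likely need.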