Medical Social Web and Event Detection

Dr. Kerstin Denecke

Agenda

• The Medical Social Web

• Event Detection

• Public Health Event Detection

• Evaluation of Event Detection Systems

• Overview on the papers

08/12/10, Kerstin Denecke

The Medical Social Web – What is that?

• Forum

• Weblogs

• Open Access Media

• Multimedia Content

• Twitter

Example tweet (translated from German): "@GoethesMatrix Tap water at my university. We recently had the norovirus here, maybe that is a symptom?"

Challenges of Social Media Data

• Huge amount of data available

  - Irrelevant vs. relevant information

• Use of specific language

  - Medical language vs. consumer health vocabulary

  - Common language

Example (irrelevant): "Doctors have concluded that a body temperature above 102 does indeed mean that you have Bieber Fever."

Example (relevant): "I got a fever sore throat and headache"

Challenges of Social Media Data

• Subjective content

  - Information vs. opinion

• Different styles of writing, noise

  - Abbreviations, writing errors, emoticons, ...

Asthma Problem Due to Allergy: A common indoor environmental asthma trigger is the mold that might be present in...

Thankful dat I got no ailments other den arthritis. Sum ppl got asthma , cancer, aids, badbreath, fatness, etc.

Event Detection

• Definition / Problem Statement

• Overview on approaches

• Public health event detection - challenges

Where is event detection important?

From: D.B. Neill, W-K. Wong: Tutorial on Event Detection, KDD 2009

Video Surveillance

Events: A journalist's perspective

1. What happened?

2. When did the event occur?

3. Who was involved?

4. Where did it take place?

5. How did it happen?

6. What is its impact, significance, consequence?

What is an event?

“An event is a specific thing that happens at a specific time and place along with all necessary preconditions and unavoidable consequences.”

C. Cieri et al.: Corpora for topic detection and tracking. In: Topic detection and tracking, 2002, pp. 33-66

Retrospective events: previously unidentified events from an accumulated historical collection

New events: new events detected from live feeds in real time

Rare vs. frequent events

Goals of event detection

• Identify if an event of interest has occurred

• Characterize the event

• Pinpoint the affected subgroup of the data, i.e.:

  - What features describe the event (e.g., spatial area, time duration)?

  - What is the severity/magnitude of the event?

• Detect as accurately as possible

• Detect as early as possible

PAHO EOC Situation Report #15 - Cholera Outbreak in Haiti

Event Detection Approaches

• Approaches from the natural language processing community

  - Message Understanding Conferences (MUC)

  - Topic Detection and Tracking (TDT)

  - Automatic Content Extraction (ACE)

• Approaches from the data mining community

  - Classification

  - Clustering

Natural Language Processing Approaches

• Message Understanding Conferences (MUC)

  - Information extraction, template filling

• Topic Detection and Tracking (TDT)

  - Issues related to detecting and tracking events in broadcast news

  - An event refers to a topic

• Automatic Content Extraction (ACE)

  - An event is an n-ary relation, binding entities of a given type together by an explicit and named concept

  - Definition of various types of events, such as marriage, death, merger

→ Pattern- or rule-based approaches
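
As a minimal sketch of the pattern-based style behind template filling, the snippet below fills a small outbreak template with one handcrafted regular expression. The pattern, the slot names, and the example sentence are illustrative assumptions; they are not taken from MUC, TDT, or ACE.

```python
import re

# Hypothetical handcrafted rule: "<disease> outbreak in <location>".
# Real systems combine many such rules with linguistic preprocessing.
OUTBREAK_PATTERN = re.compile(
    r"(?P<disease>[A-Za-z]+) outbreak in (?P<location>[A-Z][a-z]+)"
)


def fill_template(sentence):
    """Return a filled event template (a dict), or None if the rule does not match."""
    match = OUTBREAK_PATTERN.search(sentence)
    if match is None:
        return None
    return {
        "event_type": "outbreak",
        "disease": match.group("disease").lower(),
        "location": match.group("location"),
    }


print(fill_template("PAHO reports on the cholera outbreak in Haiti."))
```

A sentence that does not fit the rule yields no template at all, which is the typical weakness of handcrafted patterns on informal social media text.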

Event Detection Approaches

• Approaches from the data mining community

  - Assumption: a document either is or is not part of a document set containing an event

  - Classification

  - Clustering

Machine learning: learning characteristics from feature sets
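
The clustering idea can be sketched as single-pass clustering: each incoming document joins the most similar existing cluster, and a document that is not similar enough to any cluster opens a new one and becomes a candidate new event. This is only one possible instance of the approach; the whitespace tokenization, the similarity threshold of 0.3, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts; whitespace tokenization is a simplification."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def single_pass_clustering(documents, threshold=0.3):
    """Return one cluster index per document, processing documents in order.

    A document that opens a new cluster is a candidate "new event".
    """
    centroids = []  # summed term counts per cluster
    labels = []
    for doc in documents:
        vec = vectorize(doc)
        scores = [cosine(vec, c) for c in centroids]
        best = max(range(len(scores)), key=scores.__getitem__, default=None)
        if best is not None and scores[best] >= threshold:
            centroids[best].update(vec)  # merge the document into the cluster
            labels.append(best)
        else:
            centroids.append(Counter(vec))
            labels.append(len(centroids) - 1)
    return labels


docs = [
    "cholera outbreak in haiti",
    "haiti cholera outbreak spreads",
    "norovirus cases at university",
]
print(single_pass_clustering(docs))
```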

Event Detection vs. Information Retrieval

Information retrieval

- Relies upon a user-defined query to specify what is "interesting"

- Finds documents that satisfy an information request

Event detection (in particular: new event detection)

- No knowledge of what events will happen → no specified query

- Might look for clues in relevant information sources

Public Health Event Detection

• Definitions

• Overview on approaches

• Current challenges

Tens of thousands of people in Haiti are threatened by a recent Cholera outbreak despite the UN insisting that the endemic is stabilising.

Definitions

Disease Surveillance

  - Epidemiological practice by which the spread of disease is monitored in order to establish patterns of progression

Epidemic Intelligence (EI)

  - Complements traditional surveillance systems by incorporating new official and unofficial sources of structured and unstructured information

Definitions (continued)

Public Health Event: involves a disease occurrence or death above expected levels for the specific disease at a particular time and place.

Indicator: refers to an epidemiological quantity that is based on clearly defined events, usually cases of a disease or an infection according to a case definition, that are reported in a standardized way.

Signal: is generated by a system based on observed data sources whenever a predefined or computed threshold for an indicator is exceeded. It can therefore be considered a hint of a possible public health event.

Motivation

From: D.B. Neill, W-K. Wong: Tutorial on Event Detection, KDD 2009

Approaches to Disease Surveillance

Event-based Surveillance

= the organized and rapid capture of information about events that are a potential risk to public health

• Information can be rumors or ad-hoc reports transmitted through formal and informal channels

• Systems rely on the immediate reporting of events and are designed to detect:

  - Rare and new events that are not specifically included in indicator-based surveillance

  - Events that occur in populations which do not access health care through formal channels

Producing signals for a public health event

Norovirus example

Norovirus outbreak (Real event)

Indicator (Multiple instances in which the term norovirus is mentioned)

Signal (Number of mentions exceeds specific threshold)

Event-detection (Public health event)
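
The chain from indicator to signal in the norovirus example can be sketched as follows. The messages, the dates, and the threshold of 2 are made-up toy values for illustration only, and the function names are hypothetical.

```python
from collections import Counter


def daily_mention_counts(messages, term):
    """Indicator: number of messages per day that mention the term."""
    counts = Counter()
    for day, text in messages:
        if term in text.lower():
            counts[day] += 1
    return counts


def signals(counts, threshold):
    """Signal: days on which the indicator exceeds the predefined threshold."""
    return sorted(day for day, n in counts.items() if n > threshold)


# Toy data, invented for this sketch.
messages = [
    ("2010-12-06", "Norovirus at my university again"),
    ("2010-12-07", "Three colleagues down with norovirus"),
    ("2010-12-07", "Is this norovirus? Feeling awful"),
    ("2010-12-07", "Norovirus outbreak on campus"),
    ("2010-12-08", "Back to normal today"),
]
counts = daily_mention_counts(messages, "norovirus")
print(signals(counts, threshold=2))
```

A signal produced this way is only a hint; whether it corresponds to a real public health event still has to be verified.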

Approaches to event-based EI

• Pattern matching

• Handcrafted rules

• Machine learning

• Automatically created patterns

Existing Event-based Surveillance Systems

Current Challenges in EI

• Technological challenges

  - Considering many data sources

  - Dealing with noise

  - High specificity and sensitivity

• Epidemiological challenges

  - Detection time

  - Emerging and unknown diseases

Epidemiological Challenges: Detection Time

[Figure: time to event detection for different EI methods. Time axis t starting at event occurrence, with marks at 2 hours, 1 day, and 3 days.]

Sources shown in the figure:

• Web 2.0 and user-generated sources

• Existing event-based services: MedISys, ProMed Mail, news, publications, government websites

• Established surveillance systems: SurvNet, ARS, sentinels

Evaluation of Event Detection Systems

• General Approach

  - Training set, annotated with events

  - Test set

  - 10-fold cross-validation

  - Measures: precision, recall, F-score

• Annotated corpora are only available for events in specific domains (news)

• Creation of annotated data sets is time-consuming
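
The three measures can be computed from a set of annotated (gold) events and a set of detected events. The sketch below assumes that events are represented by simple identifiers and that a detection counts as correct only on an exact match; both are simplifying assumptions.

```python
def precision_recall_f1(gold, predicted):
    """Compare two collections of event identifiers and return (P, R, F1)."""
    gold, predicted = set(gold), set(predicted)
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy example: four annotated events, three detections, two of them correct.
p, r, f1 = precision_recall_f1({"e1", "e2", "e3", "e4"}, {"e1", "e2", "e5"})
print(round(p, 3), round(r, 3), round(f1, 3))
```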

Evaluation

• TREC entity track

  - Goal: perform entity-related search on Web data

  - http://ilps.science.uva.nl/trec-entity/

• TRECVid Surveillance Event Detection

  - Goal: support the development of technologies to detect visual events (people engaged in particular activities) in a large collection of streaming video data

  - Training data: 150 hours of multi-camera airport surveillance domain data

  - Test corpus

  - http://www.itl.nist.gov/iad/mig//tests/trecvid/2010/

Evaluation: Challenges for public health

General challenges

• Active feedback from the users necessary

• Time consuming

• Component evaluation vs. system evaluation

Specific challenges

• No annotated data set for medical social media data

• Sensitivity vs. specificity

• Comparison of various systems
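
The trade-off between sensitivity and specificity can be illustrated by flagging days whose count exceeds a threshold and comparing the flags with known outbreak days. The daily counts and outbreak days below are invented toy values, and the function names are hypothetical.

```python
def confusion(daily_counts, outbreak_days, threshold):
    """Count TP, FP, TN, FN when a day is flagged if its count exceeds the threshold."""
    tp = fp = tn = fn = 0
    for day, count in daily_counts.items():
        flagged = count > threshold
        actual = day in outbreak_days
        if flagged and actual:
            tp += 1
        elif flagged:
            fp += 1
        elif actual:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def sensitivity_specificity(tp, fp, tn, fn):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return sensitivity, specificity


# Toy data: day number -> message count; days 2 and 3 are true outbreak days.
daily_counts = {1: 2, 2: 9, 3: 4, 4: 1, 5: 7, 6: 3}
outbreak_days = {2, 3}
for threshold in (3, 8):
    print(threshold, sensitivity_specificity(*confusion(daily_counts, outbreak_days, threshold)))
```

With these toy values, the lower threshold catches both outbreak days at the cost of one false alarm, while the higher threshold raises no false alarm but misses one outbreak day.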

Overview on the papers

1. Statistical Challenges Facing Early Outbreak Detection in Biosurveillance (G. Shmueli, H. Burkom)

2. Detecting influenza outbreaks by analysing Twitter messages (A. Culotta)

3. What's unusual in online disease outbreak news? (N. Collier)

Topics:

  - Public health event detection

  - Approaches for social media data

  - Evaluation of detection systems

Statistical Challenges Facing Early Outbreak Detection in Biosurveillance (G. Shmueli, H. Burkom)

• Topic: Challenges for applying statistical methods from indicator-based surveillance to new data

• Key task: combine new data sources with traditional ones to classify situational awareness

• Challenges:

  - Modeling of underlying background knowledge

  - Nature of an outbreak

  - Evaluation of performance

  - Requirements and uses of biosurveillance systems

Detecting influenza outbreaks by analysing Twitter messages (A. Culotta)

• Topic: Use of Twitter data to forecast future influenza rates

• Approach: Simple keyword matching

• Shows correlation with U.S. national statistics on disease outbreaks

• Problems and challenges:

  - False positives: "Bieber fever", "flu vaccines"

• Classification approach for filtering
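
A rough sketch of keyword matching with a filtering step, plus the correlation used to compare a message-based series with official statistics. The keyword list and the exclusion list are assumptions, and the exclusion list is only a crude stand-in for a trained classifier; none of this reproduces the paper's actual method.

```python
import math

KEYWORDS = ("flu", "fever", "cough", "sore throat")    # assumed keyword list
EXCLUDE = ("bieber fever", "flu vaccine", "flu shot")  # stand-in for a trained filter


def is_flu_related(message):
    """Naive substring matching, with known false-positive phrases removed."""
    text = message.lower()
    if any(phrase in text for phrase in EXCLUDE):
        return False
    return any(keyword in text for keyword in KEYWORDS)


def pearson(xs, ys):
    """Pearson correlation between two equally long, non-constant series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


print(is_flu_related("I got a fever sore throat and headache"))
print(is_flu_related("Body temperature above 102 means you have Bieber Fever"))
```

In use, the weekly share of matching messages would be passed to `pearson` together with the official weekly rates for the same weeks.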

What's unusual in online disease outbreak news? (N. Collier)

• Topic: Use of online news to support early alerting

• BioCaster system: text mining system for monitoring global online media

• Analysis of various statistical methods for signal generation

• Challenges and problems

• Conclusion: it is non-trivial to relate news counts to the actual number of cases
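
One generic way to generate signals from daily news counts is to compare each day with the mean and standard deviation of a preceding baseline window. The sketch below is an illustration of that general idea and is not claimed to be one of the methods analysed in the paper; the window of 7 days, the factor k = 3, the minimum spread of 1.0, and the counts are all assumptions.

```python
import statistics


def baseline_signals(counts, window=7, k=3.0):
    """Return indices of days whose count exceeds mean + k * sd of the
    preceding `window` days."""
    flagged = []
    for i in range(window, len(counts)):
        baseline = counts[i - window:i]
        mean = statistics.mean(baseline)
        sd = statistics.pstdev(baseline)
        # Use a minimum spread so that a nearly flat baseline does not
        # trigger a signal on tiny fluctuations.
        limit = mean + k * max(sd, 1.0)
        if counts[i] > limit:
            flagged.append(i)
    return flagged


# Toy series of daily article counts with one spike on day index 7.
print(baseline_signals([2, 3, 2, 4, 3, 2, 3, 15, 3]))
```

Because the threshold is computed from recent data, a spike also inflates the baseline for the following days, which is one reason such counts are hard to relate to the actual number of cases.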

References

• Chinchor N (ed.): Seventh Message Understanding Conference (MUC-7): Proceedings of a Conference Held in Fairfax, Virginia, April 29 - May 1, 1998, http://www.aclweb.org/anthology/M/M98/

• Doddington G, Mitchell A, Przybocki M, et al.: The Automatic Content Extraction (ACE) Program: Tasks, Data, and Evaluation. LREC 2004, Lisbon, Portugal

• Fiscus J, Doddington G: Topic Detection and Tracking Evaluation Overview. In: Topic Detection and Tracking: Event-based Information Organization. 2002

• Zhao Q, Mitra P: Event Detection and Visualization for Social Text Streams. ICWSM 2007, Boulder, Colorado, USA

Thank you for your attention!

Dr. Kerstin Denecke

Forschungszentrum [email protected]