Introduction Concepts and tools Relation Extraction Temporal Signals Modelling Tense Conclusion
Determining the Types of Temporal Relations in Discourse
Leon Derczynski
University of Sheffield
5 March, 2013
The Role of Time
Why is time important in language processing?
World state changes constantly
Every empirical assertion has temporal bounds
“The sky is blue”, but it was not always
Without it, naïve knowledge extraction will fail (given an Almanac of Presidents, who is President?)
By understanding temporal information, you will do better knowledge extraction.
Overall goal
How do we automatically understand temporal information in natural languages?
Temporal Information Extraction
Existing state of the art
How can we categorise types of temporal information?
Events – e.g. occurrences, states
Temporal expressions (timexes) – e.g. dates, durations
Links – relations between pairs of events or times
Supporting texts – e.g. action cardinality, event ordering
We develop and use ISO-TimeML to annotate these entities.
Main dataset: TimeBank (about 180 annotated documents)
TimeML
Organizers
<EVENT eid="e2120" class="REPORTING">state</EVENT> the
<TIMEX3 tid="t29" type="DURATION" value="P2D"
temporalFunction="false"
functionInDocument="NONE">two days</TIMEX3> of music, dancing, and speeches is
<EVENT eid="e2123" class="I_STATE">expected</EVENT> to
<EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people.
<TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
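As a rough illustration of how such annotations can be read back, the sketch below parses the fragment above with Python's standard library. The wrapping `<doc>` root element and the function name `read_annotations` are assumptions made for this example (the class value is spelled `I_STATE`, as TimeML writes it); it is not a full TimeML reader.

```python
import xml.etree.ElementTree as ET

# Fragment from the slide, wrapped in a hypothetical <doc> root so it parses as XML.
FRAGMENT = """<doc>Organizers
<EVENT eid="e2120" class="REPORTING">state</EVENT> the
<TIMEX3 tid="t29" type="DURATION" value="P2D"
temporalFunction="false"
functionInDocument="NONE">two days</TIMEX3> of music, dancing, and speeches is
<EVENT eid="e2123" class="I_STATE">expected</EVENT> to
<EVENT eid="e13" class="OCCURRENCE">draw</EVENT> some two million people.
<TLINK eventID="e2123" relatedToTime="t29" relType="BEFORE"/>
</doc>"""


def read_annotations(xml_text):
    """Return (events, timexes, links) found in a TimeML-style fragment."""
    root = ET.fromstring(xml_text)
    # Events: id -> (class, annotated text)
    events = {e.get("eid"): (e.get("class"), e.text) for e in root.iter("EVENT")}
    # Timexes: id -> (type, normalised value, annotated text)
    timexes = {
        t.get("tid"): (t.get("type"), t.get("value"), t.text)
        for t in root.iter("TIMEX3")
    }
    # Links: (event id, relation type, timex id); only event-to-time links are handled here.
    links = [
        (link.get("eventID"), link.get("relType"), link.get("relatedToTime"))
        for link in root.iter("TLINK")
    ]
    return events, timexes, links


events, timexes, links = read_annotations(FRAGMENT)
for source, rel_type, target in links:
    print(events[source][1], rel_type, timexes[target][2])
```

The loop prints one line per link, pairing the text of the event with the text of the time expression it is related to.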
Times and Events
What are temporal expressions?
They refer to a time
Subtasks: recognition and interpretation; SotA recognition is 0.86 F1
What do we consider as events?
Verbal, nominal
State of the art: 0.90 F1 for recognition
Doesn’t cover complex structure; e.g. a music festival
Events are not very useful unless related to other temporal entities
How can we describe this structural complexity?
Start by modeling the document as a graph
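One way to picture the graph view is sketched below: events and timexes become nodes, and each temporal link becomes a labelled, directed edge. The class name `TemporalGraph` and its methods are hypothetical, chosen for this illustration only.

```python
from collections import defaultdict


class TemporalGraph:
    """Nodes are event/timex ids; directed edges carry a relation label."""

    def __init__(self):
        self.nodes = {}                 # id -> kind ("event" or "timex")
        self.edges = defaultdict(dict)  # source id -> {target id: label}

    def add_node(self, node_id, kind):
        self.nodes[node_id] = kind

    def add_link(self, source, target, label):
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of a link must be annotated first")
        self.edges[source][target] = label

    def unlinked(self):
        """Ids that take part in no link, i.e. entities not yet related to anything."""
        linked = set(self.edges)
        for targets in self.edges.values():
            linked.update(targets)
        return sorted(set(self.nodes) - linked)


# Ids borrowed from the TimeML example; the single link is the one annotated there.
graph = TemporalGraph()
for event_id in ("e2120", "e2123", "e13"):
    graph.add_node(event_id, "event")
graph.add_node("t29", "timex")
graph.add_link("e2123", "t29", "BEFORE")
print(graph.unlinked())
```

Listing the unlinked nodes makes the earlier point concrete: an event with no edges is annotated but not yet placed relative to anything else.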
Temporal relations
What are temporal relations?
They describe the links between times and events
Can capture both complex and partial orderings
What kinds of temporal relation are there?
1 Interval (before, after, included by, simultaneous)
2 Subordinate (reported speech, modal, conditional)
3 Aspectual (start, culmination – see Vendler, Comrie)
This work is concerned with the coarsest-grained information: the first category
Problem Definition
How are these relations represented?
Temporal interval algebra (Allen 1984) – a set of 13 relations between a pair of intervals
TimeML defines a set of relation types and also types of interval
What is our problem?
Assume discourse with perfect event and timex annotations
In fact, assume we know which intervals to link!
“Given an ordered pair of intervals (arg1, arg2), which relation in the set R_Allen describes them?”
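To make the target labels concrete, the sketch below names the interval-algebra relation for an ordered pair of intervals whose endpoints are known. Numeric endpoints and the function name `allen_relation` are assumptions for illustration; in the task described here the endpoints are not available, and the label has to be chosen from the text.

```python
def allen_relation(a, b):
    """Name the interval relation holding from a to b.

    a and b are (start, end) pairs with start < end.
    """
    (a_start, a_end), (b_start, b_end) = a, b
    if a_start >= a_end or b_start >= b_end:
        raise ValueError("intervals must have positive length")
    # Disjoint or touching intervals.
    if a_end < b_start:
        return "before"
    if a_end == b_start:
        return "meets"
    if b_end < a_start:
        return "after"
    if b_end == a_start:
        return "met-by"
    # From here on the intervals share some stretch of time.
    if a_start == b_start and a_end == b_end:
        return "equal"
    if a_start == b_start:
        return "starts" if a_end < b_end else "started-by"
    if a_end == b_end:
        return "finishes" if a_start > b_start else "finished-by"
    if b_start < a_start and a_end < b_end:
        return "during"
    if a_start < b_start and b_end < a_end:
        return "contains"
    # Only partial overlap remains.
    return "overlaps" if a_start < b_start else "overlapped-by"


print(allen_relation((1, 3), (4, 6)))
print(allen_relation((2, 3), (1, 5)))
```

The order of the pair matters: swapping the arguments yields the inverse relation, which is why the problem is stated over ordered pairs.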
Relation Extraction
How can relations be labelled?
Machine learning
Using TimeML attributes: some success
Using syntactic relations: matches SotA in tree kernels
What’s the state of the art?
2007: Mani et al.: baseline 56%, system has 61% accuracy
2008: Bethard, Chambers: many sophisticated improvements – ILP, timex-timex ordering. Improved on Mani et al. by 1.5%.
2010: TempEval-2: baseline 58%, best was 65% accuracy
Why do we find this performance ceiling?
Sources of Temporal Relation Information
What are we missing?
There is a heterogeneous set of temporal information types, including:
Explicit signals – subsequently, as soon as
Linguistic theory offers some models
What is the evidence these two types will help?
Conducted failure analysis: TempEval-2010 1
Multiple diverse approaches, same dataset
Find the set of difficult links
Characterise information supporting these links
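The counting behind such a failure analysis might look like the sketch below. The link ids, system names and labels are invented toy data, and the `min_failing` threshold is an assumption: the slides do not state how many failing systems make a link "difficult".

```python
from collections import Counter


def failure_counts(gold, system_outputs):
    """Map each link id to the number of systems that labelled it wrongly."""
    return {
        link: sum(1 for output in system_outputs.values() if output.get(link) != label)
        for link, label in gold.items()
    }


def difficult_links(gold, system_outputs, min_failing):
    """Links that at least min_failing systems got wrong."""
    counts = failure_counts(gold, system_outputs)
    return sorted(link for link, n in counts.items() if n >= min_failing)


# Invented toy data: three links, three systems.
gold = {"l1": "BEFORE", "l2": "AFTER", "l3": "OVERLAP"}
systems = {
    "sysA": {"l1": "BEFORE", "l2": "BEFORE", "l3": "BEFORE"},
    "sysB": {"l1": "BEFORE", "l2": "AFTER", "l3": "AFTER"},
    "sysC": {"l1": "BEFORE", "l2": "AFTER", "l3": "BEFORE"},
}

counts = failure_counts(gold, systems)
histogram = Counter(counts.values())  # number of failing systems -> number of links
print(histogram)
print(difficult_links(gold, systems, min_failing=2))
```

The histogram groups links by how many systems failed on them, from "all systems correct" to "all systems fail".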
1 Verhagen et al., 2010: SemEval Task 13 - TempEval-2
Figure: TempEval-2 relation labelling tasks, showing proportions of relations according to the number of systems that gave correct labels. Panels: Task C (event-timex intra-sentence relations), Task D (event-DCT relations), Task E (main event inter-sentence relations), Task F (event-subordinate intra-sentence relations). Categories run from "all systems correct" through "N fail" to "all systems fail".
Figure: Proportion of links within a task that are difficult. x-axis: Task (C, D, E, F); y-axis: % difficult (0 to 40).
The problem is difficult, and there is a consistently-difficult set of links. Perhaps we are ignoring some critical information.
New sources of ordering information
Next step: manually characterise each “difficult” link.
Attempt to identify what kind of information could be used to label it.
Sources to investigate
Explicit text – signals “After you pull the pin, throw the grenade”
Tensed relations “Having eaten, I left”
Temporal Signals
What are these?
In TimeML, they are text annotated as being helpful to a temporal relation
Used by 12.2% of TimeBank’s relations
Are temporal signals useful?
A resounding yes! 61% → 83% accuracy with simple features 2
This level of performance on event-event links is above general state-of-the-art
Existing corpora are under-annotated
2 Derczynski and Gaizauskas, 2010: Using signals for temporal relation classification
Temporal Signal Annotation
How can we automatically annotate temporal signals?
Define signals formally 3
Define a closed class of signals
Re-annotate TimeBank
Train discrimination and association
We included dependency information and function tagging.
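With a closed class of signal expressions, the first step is simply to spot candidate occurrences; discrimination then decides whether each one really acts as a temporal signal. The sketch below covers only the spotting step. The phrase list is hypothetical (the slides name only "subsequently", "as soon as" and "after"), and no classifier is included.

```python
# Hypothetical closed class; only "subsequently", "as soon as" and "after" come from the slides.
CANDIDATE_SIGNALS = ["as soon as", "subsequently", "after", "before", "when", "since"]


def find_candidates(tokens, phrases=CANDIDATE_SIGNALS):
    """Return (start, end, phrase) token spans where a candidate signal phrase occurs."""
    lowered = [t.lower() for t in tokens]
    # Try longer phrases first so a multi-word signal wins over any shorter overlap.
    split_phrases = sorted((p.split() for p in phrases), key=len, reverse=True)
    spans, i = [], 0
    while i < len(lowered):
        for words in split_phrases:
            if lowered[i:i + len(words)] == words:
                spans.append((i, i + len(words), " ".join(words)))
                i += len(words)
                break
        else:
            # No phrase starts at this token; move on.
            i += 1
    return spans


print(find_candidates("After you pull the pin , throw the grenade".split()))
print(find_candidates("Call me as soon as you arrive".split()))
```

Words such as "since" or "when" also have non-temporal uses, which is why a candidate list alone is not enough and a discrimination step is trained on top of it.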
3 Derczynski and Gaizauskas, 2011: A corpus based study of temporal signals
Results
How well did our approach perform?
1 Discrimination: 92% accuracy, 75% accuracy on positives (0.77 IAA)
2 Association: 99% accuracy / 80% error reduction
3 Inductive bias towards independence assumption was harmful (MaxEnt, NBayes)
Results: 16% of links have signals (31% improvement) and can now be labelled at high accuracy.
What remains to be done?
How can we remedy under-annotation at the source?
Clear links to spatial signal annotation (e.g. -LOC tags)
Reichenbach’s Model of Verbs
How can we model tense in language?
Each verb happens at event time, E
The verb is uttered at speech time, S
Past tense: E < S John ran.
Present tense: E = S I’m free!
What differentiates simple past from past perfect?
John ran. is not the same as John had run.
Introduce abstract reference time, R
John had run. E < R < S
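The three-point model lends itself to a small lookup table. In the sketch below each tense is stored as integer ranks for E, R and S, where equal ranks mean simultaneity. The rows for simple past, simple present and past perfect follow the slide, with the reference point added in the usual way; the present perfect and simple future rows follow the standard Reichenbach table and are not taken from the slides. The names `TENSES` and `describe` are hypothetical.

```python
# Rank of each point on a timeline; equal ranks mean the points coincide.
TENSES = {
    "simple past":     {"E": 0, "R": 0, "S": 1},  # John ran.
    "past perfect":    {"E": 0, "R": 1, "S": 2},  # John had run.
    "simple present":  {"E": 0, "R": 0, "S": 0},  # I'm free!
    "present perfect": {"E": 0, "R": 1, "S": 1},  # John has run.
    "simple future":   {"E": 1, "R": 1, "S": 0},  # John will run.
}


def describe(tense):
    """Render a tense as a chain such as 'E < R < S'."""
    ranks = TENSES[tense]
    # Order points by rank, breaking ties alphabetically for a stable rendering.
    points = sorted(ranks, key=lambda p: (ranks[p], p))
    chain = points[0]
    for previous, current in zip(points, points[1:]):
        chain += (" = " if ranks[previous] == ranks[current] else " < ") + current
    return chain


for name in TENSES:
    print(f"{name}: {describe(name)}")
```

Simple past and past perfect both place E before S; only the position of R tells them apart, which is the point of introducing the reference time.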
Reasoning about tense
How is Reichenbach’s model helpful?
We can describe all verbal events as three points linked by either equality or precedence
Automatic and quick inference for relating intervals
Does it work?
Conducted first corpus-driven validation of the framework
For reporting-type links, we used features based on pairwise event-time relations
Add one feature representing the Reichenbachian ordering
Classifier reached 59% accuracy (48% MCC baseline) on 9% of all temporal relations (above SotA)
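For readers unfamiliar with this kind of baseline, the sketch below shows how one is computed, reading "MCC" as most-common-class; that reading is an assumption, since the slides do not expand the abbreviation. The labels are invented toy data, not TimeBank figures.

```python
from collections import Counter


def most_common_class(labels):
    """The label that occurs most often in the training links."""
    return Counter(labels).most_common(1)[0][0]


def baseline_accuracy(train_labels, test_labels):
    """Accuracy of always predicting the most common training label."""
    guess = most_common_class(train_labels)
    correct = sum(1 for label in test_labels if label == guess)
    return correct / len(test_labels)


# Invented toy labels for illustration only.
train = ["BEFORE", "BEFORE", "AFTER", "BEFORE", "SIMULTANEOUS"]
test = ["BEFORE", "AFTER", "BEFORE", "AFTER"]
print(most_common_class(train), baseline_accuracy(train, test))
```

A classifier is only informative to the extent that it beats this figure on the same set of links.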
Extending the model
How else can we use the model?
Positional use
Timexes relate to reference points
Only consider cases where the event and time are linguistically connected
Identify these using dependency parses
Add a feature hinting at the ordering
We reach 75% accuracy from a 67% baseline (above SotA)
Also useful for timex standard transduction 4
4 Derczynski, Llorens and Saquete 2012: Massively increasing TIMEX3 resources
Contributions
A large part of the difficult relation set (roughly 60%) is catered for by these new information sources.
Difficult task, with notable impact
Focus on automatic annotation of temporal relations
Pushed beyond SotA understanding of the problem
Creation of and contribution to language resources – e.g. ISO-TimeML, RTMML, CAVaT (among others)
... where could we go next?
Future
Forensic analysis
How can we build a consistent event model from multiple semi-reliable accounts of an event?
Challenges:
Multi-document event and actor co-reference
Story conflict resolution 5
Spatial and temporal IE from colloquial text
Building and resolving accurate co-constraining models from unreliable data (belief networks)
5 Regneri, Koller and Pinkal 2010: Learning Script Knowledge with Web Experiments
Future
Assertion bounding
All assertions have temporal bounds. How can we determine these?
Challenges:
Accurate extraction of document temporal structure
Automated reasoning
High-precision timex normalisation
Doing temporal IE & IR at gigaword scale
Future
Temporal dataset construction
Many current systems index whole documents by date, but information is more nuanced than that
Challenges:
Mapping events to temporal data points
Storing and extracting events
Anchoring events with uncertain bounds (“last year’s fighting” vs. “the fighting on April 23, 2011”)
Mining complex super-events; e.g. the Fukushima disaster; what happened when?
Recap
Temporality is ubiquitous, in the world around us and in the language we use to describe our world
Processing it automatically is difficult
Doing high-performance temporal IE opens exciting research avenues
Thank you for your time. Are there any questions?
Labellings as probability distributions
Automated methods (e.g. classifiers) may have varying degrees of confidence about a link’s label.
We could assign a set of labels and probabilities to each link.
Consistency constraints allow us to find the most-likely possible graph.
A:B → before: 0.9; after 0.1
B:C → before: 0.5; simultaneous: 0.5
A:C → before: 1.0
Very time-consuming to compute – optimisations welcome!
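A brute-force version of this search can be sketched in a few lines. The probabilities are the ones listed above; treating each of A, B and C as a single point on a timeline, and the function names, are simplifying assumptions for this illustration. The search enumerates every combination of labels, which is exactly why it scales badly.

```python
from itertools import product

# Candidate labels and probabilities, copied from the slide.
LINKS = {
    ("A", "B"): {"before": 0.9, "after": 0.1},
    ("B", "C"): {"before": 0.5, "simultaneous": 0.5},
    ("A", "C"): {"before": 1.0},
}


def holds(label, x, y):
    """Check a label against two positions on a timeline."""
    return {"before": x < y, "after": x > y, "simultaneous": x == y}[label]


def consistent(assignment):
    """True if some placement of the nodes on a timeline satisfies every label."""
    nodes = sorted({node for pair in assignment for node in pair})
    for ranks in product(range(len(nodes)), repeat=len(nodes)):
        position = dict(zip(nodes, ranks))
        if all(holds(label, position[a], position[b])
               for (a, b), label in assignment.items()):
            return True
    return False


def ranked_labellings(links):
    """All consistent labellings with their joint probability, most probable first."""
    pairs = list(links)
    results = []
    for labels in product(*(links[pair] for pair in pairs)):
        assignment = dict(zip(pairs, labels))
        if consistent(assignment):
            probability = 1.0
            for pair, label in assignment.items():
                probability *= links[pair][label]
            results.append((probability, assignment))
    return sorted(results, key=lambda result: result[0], reverse=True)


for probability, assignment in ranked_labellings(LINKS):
    print(round(probability, 3), assignment)
```

In this example the combination "A after B, B simultaneous with C, A before C" is rejected as contradictory, and the remaining labellings are ranked by the product of their label probabilities, assuming the links are independent.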
Unuttered temporal orderings
Event/Time distance
“When I was brushing my teeth” → This event happens at least twice daily; assume this instance is 0-16 hours away
Complex events
“When we were putting up the tents for the festival” → near the beginning of / just before the “festival” event