INTRODUCTION TO ARTIFICIAL INTELLIGENCE
Massimo Poesio
Relation Extraction
SEMANTIC INTERPRETATION: FROM SENTENCES TO PROPOSITIONS
Powell met Zhu Rongji
Proposition: meet(Powell, Zhu Rongji)
Powell met with Zhu Rongji
Powell and Zhu Rongji met
Powell and Zhu Rongji had a meeting
. . .
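The idea that several surface forms map to one proposition can be sketched in a few lines. This is only a toy illustration, not a real semantic parser: the `Proposition` type, the `PATTERNS` list and the `to_proposition` helper are assumptions introduced here, and the regular expressions cover only the four example sentences.

```python
import re
from typing import NamedTuple, Optional


class Proposition(NamedTuple):
    predicate: str
    args: tuple


# A capitalised name made of one or more capitalised words (toy assumption).
NAME = r"([A-Z]\w+(?: [A-Z]\w+)*)"

# One hand-written pattern per surface form of the "meet" predicate.
PATTERNS = [
    re.compile(rf"^{NAME} met(?: with)? {NAME}$"),
    re.compile(rf"^{NAME} and {NAME} met$"),
    re.compile(rf"^{NAME} and {NAME} had a meeting$"),
]


def to_proposition(sentence: str) -> Optional[Proposition]:
    """Map a sentence to meet(X, Y) if one of the toy patterns matches."""
    text = sentence.strip().rstrip(".")
    for pattern in PATTERNS:
        match = pattern.match(text)
        if match:
            return Proposition("meet", (match.group(1), match.group(2)))
    return None


sentences = [
    "Powell met Zhu Rongji",
    "Powell met with Zhu Rongji",
    "Powell and Zhu Rongji met",
    "Powell and Zhu Rongji had a meeting",
]
propositions = [to_proposition(s) for s in sentences]
```

All four sentences yield the same `Proposition`, which is the point of the slide: the proposition abstracts away from the syntactic variation.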
When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu)
discuss([Powell, Zhu], return(X, plane))
debate, consult, join, wrestle, battle
meet(Somebody1, Somebody2)
OTHER ASPECTS OF SEMANTIC INTERPRETATION
• Identification of RELATIONS between entities mentioned
– Focus of interest in modern CL since 1993 or so
• Identification of TEMPORAL RELATIONS
– From about 2003 on
• QUALIFICATION of such relations (modality, epistemicity)
– From about 2010 on
TYPES OF RELATIONS
• Predicate-argument structure (verbs and nouns)
• Nominal relations
• Relations between events / temporal relations
PREDICATE-ARGUMENT STRUCTURE
• Linguistic Theories
– Case Frames (Fillmore): FrameNet
– Lexical Conceptual Structure (Jackendoff): LCS
– Proto-Roles (Dowty): PropBank
– English verb classes, diathesis alternations (Levin): VerbNet
– Talmy, Levin and Rappaport
Fillmore’s Case Theory
• Sentences have a DEEP STRUCTURE with CASE RELATIONS
• A sentence is a verb + one or more NPs
– Each NP has a deep-structure case: A(gentive), I(nstrumental), D(ative), F(actitive), L(ocative), O(bjective)
– Subject is no more important than Object: Subject/Object are surface structure
THEMATIC ROLES
• Following Fillmore’s original work, many theories of predicate-argument structure / thematic roles were proposed. Perhaps the best known are:
– Jackendoff’s LEXICAL CONCEPTUAL STRUCTURE
– Dowty’s PROTO-ROLES theory
Dowty’s PROTO-ROLES
• Event-dependent
• Prototypes based on shared entailments
• Grammatical relations such as subject related to observed (empirical) classification of participants
• Typology of grammatical relations
• Proto-Agent
• Proto-Patient
Proto-Agent
• Properties:
– Volitional involvement in event or state
– Sentience (and/or perception)
– Causing an event or change of state in another participant
– Movement (relative to position of another participant)
– (exists independently of event named) *may be discourse pragmatic
Proto-Patient
• Properties:
– Undergoes change of state
– Incremental theme
– Causally affected by another participant
– Stationary relative to movement of another participant
– (does not exist independently of the event, or at all) *may be discourse pragmatic
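The two property lists can be read as a scoring scheme. The sketch below follows Dowty's argument selection principle as it is usually stated: the participant with the most Proto-Agent entailments is realised as subject, the one with the most Proto-Patient entailments as direct object. The property names, the `proto_scores` and `select_arguments` helpers, and the property sets assigned to the two participants are illustrative assumptions, not part of the slides.

```python
# Short labels for the properties listed on the two slides (names are assumptions).
PROTO_AGENT = {"volitional", "sentient", "causes_change", "moves", "exists_independently"}
PROTO_PATIENT = {"changes_state", "incremental_theme", "causally_affected",
                 "stationary", "dependent_existence"}


def proto_scores(properties):
    """Return (number of Proto-Agent properties, number of Proto-Patient properties)."""
    props = set(properties)
    return len(props & PROTO_AGENT), len(props & PROTO_PATIENT)


def select_arguments(participants):
    """Pick subject and object from a dict mapping participant -> set of properties."""
    subject = max(participants, key=lambda name: proto_scores(participants[name])[0])
    rest = [name for name in participants if name != subject]
    obj = max(rest, key=lambda name: proto_scores(participants[name])[1]) if rest else None
    return subject, obj


# Hand-assigned entailments for a breaking event (illustrative only).
participants = {
    "Jan": {"volitional", "sentient", "causes_change"},
    "LCD projector": {"changes_state", "causally_affected"},
}
subject, obj = select_arguments(participants)
```

With these hand-assigned properties, Jan is selected as subject and the LCD projector as object. Ties and three-argument verbs are not handled in this sketch.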
Semantic role labels:
Jan broke the LCD projector.
break (agent(Jan), patient(LCD-projector))
cause(agent(Jan), change-of-state(LCD-projector))
(broken(LCD-projector))
agent(A) -> intentional(A), sentient(A), causer(A), affector(A)
patient(P) -> affected(P), change(P),…
Fillmore, 68
Jackendoff, 72
Dowty, 91
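The rules agent(A) -> ... and patient(P) -> ... can be applied mechanically to a labelled sentence. The following is a minimal sketch: the `ROLE_ENTAILMENTS` table and the `entailed_facts` helper are names introduced here, and only the entailments actually listed on the slide are encoded (the slide's patient list continues with "…", which is left open).

```python
# Entailments per role, copied from the slide; the patient list is incomplete there.
ROLE_ENTAILMENTS = {
    "agent": ["intentional", "sentient", "causer", "affector"],
    "patient": ["affected", "change"],
}


def entailed_facts(roles):
    """Expand role assignments such as agent=Jan into the facts they entail."""
    facts = []
    for role, filler in roles.items():
        for prop in ROLE_ENTAILMENTS.get(role, []):
            facts.append(f"{prop}({filler})")
    return facts


# Role labels for "Jan broke the LCD projector."
roles = {"agent": "Jan", "patient": "LCD-projector"}
facts = entailed_facts(roles)
```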
VERBNET AND PROPBANK
• Dowty’s theory of proto-roles was the basis for the development of PROPBANK, the first corpus annotated with information about predicate-argument structure
PROPBANK REPRESENTATION
a GM-Jaguar pact that would give the U.S. car maker an eventual 30% stake in the British company.
Arg0: a GM-Jaguar pact (via the trace *T*-1)
Arg2: the US car maker
Arg1: an eventual 30% stake in the British company
give(GM-J pact, US car maker, 30% stake)
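One way to hold such an annotation in memory is a predicate plus a mapping from numbered argument labels to text spans. This is a sketch under assumptions: `PropBankInstance` and `as_predicate` are hypothetical names, not the PropBank file format or any library's API.

```python
from dataclasses import dataclass, field


@dataclass
class PropBankInstance:
    """A predicate with its numbered arguments (ArgN label -> text span)."""
    predicate: str
    args: dict = field(default_factory=dict)

    def as_predicate(self) -> str:
        # Arguments are printed in label order, not in surface order.
        labelled = [f"{label}={self.args[label]}" for label in sorted(self.args)]
        return f"{self.predicate}({', '.join(labelled)})"


instance = PropBankInstance(
    predicate="give",
    args={
        "Arg0": "GM-J pact",
        "Arg2": "US car maker",
        "Arg1": "30% stake",
    },
)
rendered = instance.as_predicate()
```

Keeping the labels explicit avoids relying on argument position, which differs between the surface order of the sentence (Arg0, Arg2, Arg1) and the numeric order of the labels.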
ARGUMENTS IN PROPBANK
• Arg0 = agent
• Arg1 = direct object / theme / patient
• Arg2 = indirect object / benefactive / instrument / attribute / end state
• Arg3 = start point / benefactive / instrument / attribute
• Arg4 = end point
• Per word vs frame level – more general?
FROM PREDICATES TO FRAMES
In one of its senses, the verb observe evokes a frame called Compliance: this frame concerns people’s responses to norms, rules or practices.
The following sentences illustrate the use of the verb in the intended sense:
– Our family observes the Jewish dietary laws.
– You have to observe the rules or you’ll be penalized.
– How do you observe Easter?
– Please observe the illuminated signs.
FrameNet
FrameNet records information about English words in the general vocabulary in terms of
1. the frames (e.g. Compliance) that they evoke,
2. the frame elements (semantic roles) that make up the components of the frames (in Compliance, Norm is one such frame element), and
3. each word’s valence possibilities, the ways in which information about the frames is provided in the linguistic structures connected to them (with observe, Norm is typically the direct object).
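The three kinds of information can be pictured as a small data model. The sketch below is an assumption-laden illustration, not the FrameNet database schema: `Frame`, `LexicalUnit` and `label_frame_elements` are names introduced here, and the `Protagonist` frame element and its mapping to the subject are assumed for the example (only Norm and its direct-object realisation come from the text above).

```python
from dataclasses import dataclass


@dataclass
class Frame:
    name: str
    frame_elements: tuple


@dataclass
class LexicalUnit:
    lemma: str
    frame: Frame
    valence: dict  # frame element -> grammatical function that typically realises it


# "Norm" is named in the text; "Protagonist" is an assumption for this sketch.
COMPLIANCE = Frame("Compliance", ("Protagonist", "Norm"))
OBSERVE = LexicalUnit(
    lemma="observe",
    frame=COMPLIANCE,
    valence={"Protagonist": "subject", "Norm": "direct object"},
)


def label_frame_elements(lexical_unit, constituents):
    """Map already-identified constituents (by grammatical function) to frame elements."""
    return {
        element: constituents[function]
        for element, function in lexical_unit.valence.items()
        if function in constituents
    }


# "Our family observes the Jewish dietary laws."
labels = label_frame_elements(
    OBSERVE,
    {"subject": "Our family", "direct object": "the Jewish dietary laws"},
)
```

The sketch assumes the grammatical functions have already been found by a parser; it only shows how a valence pattern links them to frame elements.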
theta
NOMINAL RELATIONS
CLASSIFICATION SCHEMES FOR NOMINAL RELATIONS
ONE EXAMPLE (Barker et al. 1998, Nastase & Szpakowicz 2003)
THE TWO-LEVEL TAXONOMY OF RELATIONS, 2
THE SEMEVAL-2007 CLASSIFICATION OF RELATIONS
• Cause-Effect: laugh wrinkles
• Instrument-Agency: laser printer
• Product-Producer: honey bee
• Origin-Entity: message from outer-space
• Theme-Tool: news conference
• Part-Whole: car door
• Content-Container: the air in the jar
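The SemEval-2007 task treated each of the seven relations as a separate yes/no classification problem. A minimal sketch of that setup, using only the example phrases above, might look as follows; `EXAMPLES`, `RELATIONS` and `binary_dataset` are names assumed here, and treating the other relations' examples as negatives is a simplification (the real negatives were chosen differently).

```python
# (phrase, relation) pairs taken from the slide.
EXAMPLES = [
    ("laugh wrinkles", "Cause-Effect"),
    ("laser printer", "Instrument-Agency"),
    ("honey bee", "Product-Producer"),
    ("message from outer-space", "Origin-Entity"),
    ("news conference", "Theme-Tool"),
    ("car door", "Part-Whole"),
    ("the air in the jar", "Content-Container"),
]

RELATIONS = [relation for _, relation in EXAMPLES]


def binary_dataset(relation, examples=EXAMPLES):
    """Build a yes/no dataset for one relation: True if the phrase carries that relation."""
    return [(phrase, label == relation) for phrase, label in examples]


part_whole = binary_dataset("Part-Whole")
```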
THE MUC AND ACE TASKS
• Modern research in relation extraction, too, was kicked off by the Message Understanding Conference (MUC) campaigns and continued through the Automatic Content Extraction (ACE) and Machine Reading follow-ups
• MUC: NE, coreference, TEMPLATE FILLING
• ACE: NE, coreference, relations
TEMPLATE-FILLING
EXAMPLE MUC: JOB POSTING
THE ASSOCIATED TEMPLATE
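Template filling can be sketched as a record with named slots that are filled from the text. Everything in the example below is hypothetical: the slot names (`title`, `employer`, `location`), the posting text and the regular expressions are invented for illustration and are not taken from the MUC data or from the original slide.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class JobTemplate:
    """Hypothetical template; unfilled slots stay None."""
    title: Optional[str] = None
    employer: Optional[str] = None
    location: Optional[str] = None


# One hand-written pattern per slot (toy rules, not a trained extractor).
SLOT_PATTERNS = {
    "title": re.compile(r"seeking an? ([\w ]+?) (?:to|for|in)\b"),
    "employer": re.compile(r"^([A-Z][\w&]*(?: [A-Z][\w&]*)*) is seeking"),
    "location": re.compile(r"\bin ([A-Z]\w+(?:, [A-Z]{2})?)"),
}


def fill_template(text: str) -> JobTemplate:
    """Fill each slot with the first match of its pattern, if any."""
    template = JobTemplate()
    for slot, pattern in SLOT_PATTERNS.items():
        match = pattern.search(text)
        if match:
            setattr(template, slot, match.group(1))
    return template


# Invented posting, used only to exercise the patterns.
posting = "Acme Software is seeking a senior Java developer in Austin, TX."
filled = fill_template(posting)
```

Hand-written patterns like these are brittle; they are shown only to make the slot-filling view of the task concrete.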
AUTOMATIC CONTENT EXTRACTION (ACE)
ACE: THE DATA
ACE: THE TASKS
RELATION DETECTION AND RECOGNITION
ACE: RELATION TYPES
OTHER PRACTICAL VERSIONS OF RELATION EXTRACTION
• Biomedical domain (BioNLP, BioCreative)
• Chemistry
• Cultural Heritage
THE TASK OF SEMANTIC RELATION EXTRACTION
SEMANTIC RELATION EXTRACTION: THE CHALLENGES
HISTORY OF RELATION EXTRACTION
• Before 1993: Symbolic methods (using knowledge bases)
• Since then: statistical / heuristic-based methods
– From 1995 to around 2005: mostly SUPERVISED
– More recently: also quite a lot of UNSUPERVISED / SEMI-SUPERVISED techniques
MORE COMPLEX SEMANTICS
• Modalities
• Temporal interpretation
ACKNOWLEDGMENTS
• Many slides borrowed from
– Roxana Girju
– Alberto Lavelli