Joint Posterior Revision of NLP Annotations via Ontological Knowledge



Tools: Stanford CoreNLP (NERC), DBpedia Spotlight (EL)

Gold Corpus (NERC): AIDA CoNLL-YAGO (train)

Datasets: AIDA CoNLL-YAGO (test-b), MEANTIME, TAC-KBP

Research question: Does the joint posterior revision, via YAGO, of the annotations produced by Stanford CoreNLP (NERC) and DBpedia Spotlight (EL) improve their performance?

Measures: NERC / EL / NERC+EL

Deterministically computable from the classes of the entity (possibly leveraging alignments between EL Knowledge Base and )
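The idea of deriving a NERC type deterministically from an entity's classes can be sketched as follows. This is only an illustration: the `CLASS_TO_NERC` table, the class names, and the `MISC` fallback are assumptions made for the example, not the mapping used on the poster.

```python
# Hypothetical class-to-type table (illustrative only, not from the poster).
# Dict order is the priority order, so the result is deterministic.
CLASS_TO_NERC = {
    "person": "PER",
    "organization": "ORG",
    "location": "LOC",
}


def nerc_type_from_classes(classes, default="MISC"):
    """Return the NERC type implied by a set of ontology classes."""
    for cls, nerc_type in CLASS_TO_NERC.items():
        if cls in classes:
            return nerc_type
    # No mapped class found: fall back to an assumed catch-all type.
    return default


if __name__ == "__main__":
    print(nerc_type_from_classes({"guitarist", "person"}))
```

With such a function, an EL annotation implies exactly one NERC type, which is what makes it possible to check NERC and EL annotations against each other.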

Joint Posterior Revision of NLP Annotations via Ontological Knowledge
Marco Rospocher, Francesco Corcoglioniti

Mr. Washington was runner-up at Wimbledon in 1996.

1. Problem: Incoherent Mention-level NLP Annotations

The GW Bridge is a suspension bridge over the Hudson.

Washington (state)

George Washington Bridge

2. Solution: Coherence via Ontology

Eric Clapton is one of the greatest guitar players.

PER

Eric Clapton

3. General Probabilistic Model

Variables Conditional Independence Assumptions

4. Model Instantiated on NERC + EL

Ontological Background Knowledge

confidence score learned from data

Estimating NERC

Estimating EL
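To make the joint revision idea concrete, here is a deliberately simplified sketch. It is not the poster's probabilistic model: it swaps in a hard compatibility factor (1 when the NERC type matches the type implied by the entity, a small `eps` otherwise) and picks the highest-scoring (type, entity) pair. The function name `revise`, the scores, and the entity label "Washington (tennis player)" are all hypothetical.

```python
from itertools import product


def revise(nerc_scores, el_scores, entity_type, eps=1e-3):
    """Pick the (NERC type, entity) pair with the highest joint score.

    nerc_scores: NERC type -> tool confidence for the mention
    el_scores:   entity    -> tool confidence for the mention
    entity_type: entity    -> NERC type implied by the entity's classes
    eps:         assumed penalty factor for incoherent pairs
    """
    best_pair, best_score = None, -1.0
    for (t, p_t), (e, p_e) in product(nerc_scores.items(), el_scores.items()):
        compat = 1.0 if entity_type.get(e) == t else eps
        score = p_t * p_e * compat
        if score > best_score:
            best_pair, best_score = (t, e), score
    return best_pair


if __name__ == "__main__":
    # Made-up confidences for the mention "Washington".
    nerc = {"PER": 0.7, "LOC": 0.3}
    el = {"Washington (state)": 0.6, "Washington (tennis player)": 0.4}
    types = {"Washington (state)": "LOC", "Washington (tennis player)": "PER"}
    print(revise(nerc, el, types))
```

In this toy setting the top EL candidate alone would be the state, but the coherent pair (PER, tennis player) gets the highest joint score, so both annotations end up consistent.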

6,016,695 entities

Taxonomy of 568,255 classes

(only ingoing links)

#4479

5. Evaluation


Prior (popularity based on entity ingoing links)
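One simple way to turn ingoing-link counts into a popularity prior is plain normalization, sketched below. The poster does not state the exact formula, so the normalization, the function name `popularity_prior`, and the counts in the example are assumptions.

```python
def popularity_prior(inlinks):
    """Map each entity to its share of all ingoing links.

    inlinks: entity -> number of ingoing links (non-negative integers)
    """
    total = sum(inlinks.values())
    if total == 0:
        raise ValueError("at least one entity needs a non-zero link count")
    return {entity: count / total for entity, count in inlinks.items()}


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    prior = popularity_prior({"Washington (state)": 3, "George Washington Bridge": 1})
    print(prior)
```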

Computed from gold corpus (NERC + class annotations)

Consider only class sets restricted to popular classes (seen at least n* times in the gold corpus)
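The restriction to popular classes can be sketched as a frequency filter over the gold corpus. The helper names, the toy class sets, and the choice to count each class once per gold mention are assumptions for this example.

```python
from collections import Counter


def popular_classes(gold_class_sets, n_star):
    """Return the classes seen at least n_star times in the gold corpus.

    gold_class_sets: iterable of class collections, one per gold mention.
    Each class is counted at most once per mention (an assumption).
    """
    counts = Counter(c for classes in gold_class_sets for c in set(classes))
    return {c for c, n in counts.items() if n >= n_star}


def restrict(classes, popular):
    """Keep only the popular classes of an entity's class set."""
    return frozenset(classes) & popular


if __name__ == "__main__":
    gold = [{"person", "guitarist"}, {"person"}, {"bridge"}]
    popular = popular_classes(gold, n_star=2)
    print(restrict({"person", "guitarist"}, popular))
```

Restricting class sets this way keeps the number of distinct class sets small enough to estimate their probabilities from a limited gold corpus.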

Restricted to gold mentions. Similar improvements also considering all mentions, and macro-averaging by document or by NERC type.
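The difference between micro-averaging and macro-averaging (by document or by NERC type) can be illustrated with a small sketch. The counts are invented and the function names are hypothetical; the poster's actual scores are not reproduced here.

```python
def prf(tp, fp, fn):
    """Precision, recall and F1 from true/false positive and false negative counts."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


def micro_f1(groups):
    """Pool the counts of all groups, then compute a single F1."""
    tp = sum(g[0] for g in groups.values())
    fp = sum(g[1] for g in groups.values())
    fn = sum(g[2] for g in groups.values())
    return prf(tp, fp, fn)[2]


def macro_f1(groups):
    """Compute F1 per group (document or NERC type), then average."""
    return sum(prf(*g)[2] for g in groups.values()) / len(groups)


if __name__ == "__main__":
    # Made-up (tp, fp, fn) counts per NERC type.
    groups = {"PER": (8, 2, 2), "ORG": (1, 1, 3)}
    print(micro_f1(groups), macro_f1(groups))
```

Micro-averaging is dominated by the groups with the most mentions, whereas macro-averaging gives every document or type the same weight.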

PER

ORG

NERC: Named Entity Recognition and Classification

EL: Entity Linking
