Post on 22-Dec-2015
Citances and What should our UI look like?
Marti Hearst
SIMS, UC Berkeley
http://biotext.berkeley.edu
Supported by NSF DBI-0317510 and a gift from Genentech
Each of these in turn is cited for some fact(s) …
… until it is the case that all important facts in the field can be found in citation sentences alone!
Citances
• Nearly every statement in a bioscience journal article is backed up with a cite.
• It is quite common for papers to be cited 30-100 times.
• The text around the citation tends to state biological facts. (Call these citances.)
Different citances will state the same facts in different ways …
… so can we use these for creating models of language expressing semantic relations?
Using Citances
Potential uses of citation sentences (citances):
• creation of training and testing data for semantic analysis,
• synonym set creation,
• database curation,
• document summarization,
• and information retrieval generally.
Issues for Processing Citances
• Text span
  Identification of the appropriate phrase, clause, or sentence that constitutes a citance.
  Correct mapping of citations when shown as lists or groups (e.g., “[22-25]”).
• Grouping citances by topic
  Citances that cite the same document should be grouped by the facts they state.
• Normalizing or paraphrasing citances
  For IR, summarization, learning synonyms, relation extraction, question answering, and machine translation.
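The mapping of grouped citations such as “[22-25]” to individual references can be illustrated with a small sketch. The function name `expand_citation_group` and the assumption that markers are bracketed, numeric, and separated by commas or semicolons are illustrative only; the slides do not say how this step was done.

```python
import re
from typing import List


def expand_citation_group(marker: str) -> List[int]:
    """Expand a marker such as "[22-25]" or "[3, 7-9]" into reference numbers.

    Assumes purely numeric references; anything else raises ValueError.
    """
    inner = marker.strip().strip("[]")
    numbers: List[int] = []
    for part in re.split(r"\s*[,;]\s*", inner):
        if not part:
            continue
        # A range is two numbers joined by a hyphen or an en dash.
        match = re.fullmatch(r"(\d+)\s*[-\u2013]\s*(\d+)", part)
        if match:
            start, end = int(match.group(1)), int(match.group(2))
            numbers.extend(range(start, end + 1))
        else:
            numbers.append(int(part))
    return numbers


print(expand_citation_group("[22-25]"))
print(expand_citation_group("[3, 7-9]"))
```

Each expanded number can then be matched against the reference list, so that a sentence citing a group counts as a citance for every document in that group.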
Citances: Some preliminary results
• Citances to a document align well with a hand-built curation.
• Citances are good candidates for paraphrase creation.
Paraphrase Creation Algorithm
1. Extract the sentences that cite the target.
2. Mark the NEs of interest (genes/proteins, MeSH terms) and normalize.
3. Dependency parse (MiniPar).
4. For each parse:
   For each pair of NEs of interest:
   i. Extract the path between them.
   ii. Create a paraphrase from the path.
5. Rank the candidates for a given pair of NEs.
6. Select only the ones above a threshold.
7. Generalize.
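Step 4 (extracting the path between two NEs in a dependency parse) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the parse is hand-assigned rather than produced by MiniPar, it is stored as a simple list of head indices, and the names `TOKENS`, `HEADS`, `dependency_path`, and `candidate_paraphrase` are hypothetical.

```python
# Hand-assigned toy parse (an assumption, not MiniPar output).
# HEADS[i] is the index of token i's head; -1 marks the root.
TOKENS = ["NGF", "withdrawal", "from", "sympathetic", "neurons",
          "induces", "Bim", "which", "then", "contributes", "to", "death"]
HEADS = [1, 5, 1, 4, 2, -1, 5, 9, 9, 6, 9, 10]


def path_to_root(node, heads):
    """Return the chain of token indices from node up to the root."""
    path = [node]
    while heads[path[-1]] != -1:
        path.append(heads[path[-1]])
    return path


def dependency_path(a, b, heads):
    """Return the token indices on the tree path from a to b."""
    up_a = path_to_root(a, heads)
    up_b = path_to_root(b, heads)
    above_a = set(up_a)
    # The first ancestor of b that also lies above a is where the paths meet.
    meet = next(n for n in up_b if n in above_a)
    left = up_a[: up_a.index(meet) + 1]
    right = up_b[: up_b.index(meet)]
    return left + right[::-1]


def candidate_paraphrase(a, b, tokens, heads):
    """Join the words on the path in their original sentence order."""
    return " ".join(tokens[i] for i in sorted(dependency_path(a, b, heads)))


# Path between "NGF" (index 0) and "Bim" (index 6).
print(candidate_paraphrase(0, 6, TOKENS, HEADS))
```

Words off the path, such as the relative clause after "Bim", are dropped, which is what shortens the citance into a candidate paraphrase. Ranking and thresholding (steps 5 and 6) are not shown because the slides do not specify the scoring.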
Creating a Paraphrase
Given the path from the dependency parse:
Restore the original word order.
Add words to improve grammaticality.
• Bim … shown … be … following nerve growth factor withdrawal.
• Bim [has] [been] shown [to] be [upregulated] following nerve growth factor withdrawal.
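The two bullets above can be reproduced with a short sketch. It assumes the sentence is already tokenized, the path is given as token indices in any order, and the indices of the words to add back are supplied by hand; how those added words are actually chosen is not described in the slides. The function name `render` is hypothetical.

```python
SENTENCE = ["Bim", "has", "been", "shown", "to", "be", "upregulated",
            "following", "nerve", "growth", "factor", "withdrawal"]

# Path indices in traversal order (not sentence order); indices 8-11 are the
# NE "nerve growth factor withdrawal".
PATH = [8, 9, 10, 11, 7, 5, 3, 0]

# Words added back for grammaticality (chosen by hand in this sketch).
ADDED = [1, 2, 4, 6]


def render(tokens, path, added=()):
    """Restore sentence order; mark gaps with an ellipsis and added words with brackets."""
    on_path = set(path)
    kept = sorted(on_path | set(added))
    pieces = []
    previous = None
    for i in kept:
        if previous is not None and i - previous > 1:
            pieces.append("…")  # one or more skipped words
        word = tokens[i]
        if i not in on_path:
            word = "[" + word + "]"
        pieces.append(word)
        previous = i
    return " ".join(pieces)


print(render(SENTENCE, PATH))
print(render(SENTENCE, PATH, ADDED))
```

Sorting by token index is what restores the original word order; the brackets only make visible which words did not come from the dependency path.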
Sample Sentences
• NGF withdrawal from sympathetic neurons induces Bim, which then contributes to death.
• Nerve growth factor withdrawal induces the expression of Bim and mediates Bax dependent cytochrome c release and apoptosis.
• The proapoptotic Bcl-2 family member Bim is strongly induced in sympathetic neurons in response to NGF withdrawal.
• In neurons, the BH3 only Bcl2 member, Bim, and JNK are both implicated in apoptosis caused by nerve growth factor deprivation.
Their Paraphrases
• NGF withdrawal induces Bim.
• Nerve growth factor withdrawal induces the expression of Bim.
• Bim has been shown to be upregulated following nerve growth factor withdrawal.
• Bim implicated in apoptosis caused by nerve growth factor deprivation.
They all paraphrase: Bim is induced after NGF withdrawal.