Moving beyond free text

Upload: adolfo

Post on 22-Feb-2016

TRANSCRIPT

Page 1: Moving beyond free text

Moving beyond free text

Page 2: Moving beyond free text

Moving beyond free text

Authors

Page 3: Moving beyond free text

Old Paradigm:

Scientist does research

Scientist publishes research results in journal article

Page 4: Moving beyond free text

Want:

All genes involved in seed development (name, species, protein sequence)
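If published results were already available as structured records, the request above could be a one-line filter rather than a reading list. The following is a minimal sketch under that assumption. The record layout, the function name `genes_involved_in`, and every gene name, species, and sequence are hypothetical placeholders, not data from the presentation. A real system would match on ontology term identifiers rather than on labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneAnnotation:
    """One structured result: a gene linked to a biological process."""
    gene_name: str
    species: str
    protein_sequence: str
    process: str  # ontology term label; a real system would store a term ID


# Hypothetical placeholder records, for illustration only.
ANNOTATIONS = [
    GeneAnnotation("geneA", "Species one", "MPLACEHOLDER", "seed development"),
    GeneAnnotation("geneB", "Species two", "MPLACEHOLDER", "leaf development"),
    GeneAnnotation("geneC", "Species one", "MPLACEHOLDER", "seed development"),
]


def genes_involved_in(process, annotations):
    """Return (name, species, protein sequence) for genes annotated to a process."""
    return [
        (a.gene_name, a.species, a.protein_sequence)
        for a in annotations
        if a.process == process
    ]


seed_genes = genes_involved_in("seed development", ANNOTATIONS)
for name, species, sequence in seed_genes:
    print(name, species, sequence)
```

The point of the sketch is only that the query becomes trivial once the annotations exist. Producing those annotations is the hard part.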

Page 5: Moving beyond free text

Read 3,404 articles???

Page 6: Moving beyond free text

Read 592,000 articles???

Page 7: Moving beyond free text

Old Paradigm - extended:

Scientist does research

Scientist publishes research results as free text

manual curation (+ NLP…?)

Results extracted from free text and converted to a structured format (ontology annotations)

Database

Structured data combined with other data for queries, further analysis

Page 8: Moving beyond free text

Example – Journal article about gene function

Page 9: Moving beyond free text

The goal: an annotation that captures the result

Example – Journal article about gene function

Page 10: Moving beyond free text

Manual curation: Time consuming, does not scale well

NLP: Very challenging

The goal: an annotation that captures the result

Example – Journal article about gene function

Page 11: Moving beyond free text

Example – phylogenetic treatment

http://www.mobot.org/mobot/research/apweb/welcome.html

Relatively high degree of structure compared to journal article

May be more amenable to natural language processing, but still very challenging because the information is complex

Page 12: Moving beyond free text

Scientist does research

Scientist publishes research results as free text

manual curation (+ NLP)

Results extracted from free text and converted to a structured format (ontology annotations)

Database

Structured data combined with other data for queries, further analysis

Can we get authors involved?

Page 13: Moving beyond free text

Link to external resource

Scientific Publishers are interested in this problem…

Page 14: Moving beyond free text

Science Direct: http://www.sciencedirect.com/science/article/pii/S0378111910001502

Scientific Publishers are interested in this problem…

Page 15: Moving beyond free text

Scientific Publishers are interested in this problem…

Page 16: Moving beyond free text

Databases are interested in this problem…

Page 17: Moving beyond free text

Databases are interested in this problem…

Page 18: Moving beyond free text

What if we had a good general tool for authors to do this themselves?

Page 19: Moving beyond free text

http://herbarium.usu.edu/webmanual/

Example: Morphological description of species

Page 20: Moving beyond free text

http://herbarium.usu.edu/webmanual/

Example: Morphological description of species

Page 21: Moving beyond free text

PO:0025034 (leaf), PATO:0000599 (decreased width)

PO:0020003 (ovule), PATO:0000460 (abnormal)

PO:0009010 (seed), PATO:0001997 (reduced)

Example: Mutant phenotype description
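Each line on this slide pairs a Plant Ontology (PO) entity with a PATO quality. One possible way to hold such pairs in code is sketched below. The class and function names (`OntologyTerm`, `PhenotypeAnnotation`, `find_by_entity`) are illustrative assumptions, not part of any tool named in the presentation. The term IDs and labels are copied from the slide as written.

```python
from typing import NamedTuple


class OntologyTerm(NamedTuple):
    term_id: str  # e.g. "PO:0025034"
    label: str    # label as shown on the slide


class PhenotypeAnnotation(NamedTuple):
    entity: OntologyTerm   # the plant structure affected (PO term)
    quality: OntologyTerm  # how it is affected (PATO term)


# The three entity/quality pairs from the slide.
MUTANT_PHENOTYPE = [
    PhenotypeAnnotation(
        OntologyTerm("PO:0025034", "leaf"),
        OntologyTerm("PATO:0000599", "decreased width"),
    ),
    PhenotypeAnnotation(
        OntologyTerm("PO:0020003", "ovule"),
        OntologyTerm("PATO:0000460", "abnormal"),
    ),
    PhenotypeAnnotation(
        OntologyTerm("PO:0009010", "seed"),
        OntologyTerm("PATO:0001997", "reduced"),
    ),
]


def find_by_entity(annotations, term_id):
    """Return every annotation whose entity has the given ontology ID."""
    return [a for a in annotations if a.entity.term_id == term_id]


for annotation in find_by_entity(MUTANT_PHENOTYPE, "PO:0009010"):
    print(annotation.entity.label, "->", annotation.quality.label)
```

Because each annotation carries stable identifiers rather than free-text wording, records like these can be matched against other datasets that use the same ontologies.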

Page 22: Moving beyond free text

New Paradigm:

Scientist does research

Scientist publishes research results as free text and as annotations using ontology terms

Benefit to scientist – wider exposure and reuse of results

Benefit to publishers – tagged text allows enhanced presentation for subscribers

Benefit to research community – better access to data