biomedical literature mining (and why we really need open access)

Biomedical literature mining (and why we really need Open Access), Lars Juhl Jensen, EMBL Heidelberg

Upload: lars-juhl-jensen

Post on 25-May-2015


DESCRIPTION

The 28th IATUL annual conference: Global Access to Science - Scientific Publishing for the Future, Royal Institute of Technology (KTH), Stockholm, Sweden, June 11-14, 2007

TRANSCRIPT

Page 1: Biomedical literature mining (and why we really need open access)

Biomedical literature mining (and why we really need Open Access)

Lars Juhl Jensen, EMBL Heidelberg

Page 2: Biomedical literature mining (and why we really need open access)

why biomedicine?

Page 3: Biomedical literature mining (and why we really need open access)

why literature mining?

Page 4: Biomedical literature mining (and why we really need open access)

why open access?

Page 5: Biomedical literature mining (and why we really need open access)

MEDLINE

Page 6: Biomedical literature mining (and why we really need open access)

17 million citations

Page 7: Biomedical literature mining (and why we really need open access)

Jensen et al., Nature Reviews Genetics, 2006

Page 8: Biomedical literature mining (and why we really need open access)

too much to read

Page 9: Biomedical literature mining (and why we really need open access)

literature mining

Page 10: Biomedical literature mining (and why we really need open access)

open access

Page 11: Biomedical literature mining (and why we really need open access)

information retrieval

Page 12: Biomedical literature mining (and why we really need open access)

finding the papers

Page 13: Biomedical literature mining (and why we really need open access)

ad hoc retrieval

Page 14: Biomedical literature mining (and why we really need open access)
Page 15: Biomedical literature mining (and why we really need open access)

user-specified query

Page 16: Biomedical literature mining (and why we really need open access)

“yeast AND cell cycle”

Page 17: Biomedical literature mining (and why we really need open access)

stemming

Page 18: Biomedical literature mining (and why we really need open access)

yeast / yeasts

Page 19: Biomedical literature mining (and why we really need open access)

dynamic query expansion

Page 20: Biomedical literature mining (and why we really need open access)

yeast / S. cerevisiae
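The retrieval steps on the preceding slides (a user-specified Boolean query, stemming, and query expansion) can be sketched in a few lines. This is only an illustrative toy, not the system used in the talk: the `SYNONYMS` table, the crude plural-stripping `stem` function, and the function names are all assumptions made for the example.

```python
import re

# Hypothetical expansion table; a real system would use a curated resource.
SYNONYMS = {
    "yeast": {"yeast", "s. cerevisiae", "saccharomyces cerevisiae"},
}

def stem(word):
    """Very naive plural stripping (yeasts -> yeast); not a real stemmer."""
    if len(word) > 3 and word.endswith("s") and not word.endswith(("ss", "is", "us")):
        return word[:-1]
    return word

def normalize(text):
    """Lowercase, tokenize, stem, and pad with spaces for whole-word matching."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return " " + " ".join(stem(t) for t in tokens) + " "

def matches(query, document):
    """True if every AND-term (or one of its expansions) occurs in the document."""
    doc = normalize(document)
    for term in query.split(" AND "):
        term = term.strip()
        variants = SYNONYMS.get(term.lower(), {term})
        if not any(normalize(v) in doc for v in variants):
            return False
    return True

abstracts = [
    "Yeasts arrest the cell cycle after DNA damage.",
    "Cell cycle control in S. cerevisiae",
    "Cell cycle control in human fibroblasts",
]
hits = [a for a in abstracts if matches("yeast AND cell cycle", a)]
```

The first abstract is found only because of stemming, the second only because of query expansion, and the third is correctly rejected.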

Page 21: Biomedical literature mining (and why we really need open access)
Page 22: Biomedical literature mining (and why we really need open access)

MEDLINE

Page 23: Biomedical literature mining (and why we really need open access)

abstracts

Page 24: Biomedical literature mining (and why we really need open access)

complete papers

Page 25: Biomedical literature mining (and why we really need open access)

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

Page 26: Biomedical literature mining (and why we really need open access)

yeast?

Page 27: Biomedical literature mining (and why we really need open access)

cell cycle?

Page 28: Biomedical literature mining (and why we really need open access)

entity recognition

Page 29: Biomedical literature mining (and why we really need open access)

identifying the substance(s)

Page 30: Biomedical literature mining (and why we really need open access)

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

Page 31: Biomedical literature mining (and why we really need open access)

Cdc28 yeast

Page 32: Biomedical literature mining (and why we really need open access)

Cdc28 cell cycle

Page 33: Biomedical literature mining (and why we really need open access)

good synonyms list

Page 34: Biomedical literature mining (and why we really need open access)

manual curation

Page 35: Biomedical literature mining (and why we really need open access)

orthographic variation

Page 36: Biomedical literature mining (and why we really need open access)

CDC28

Page 37: Biomedical literature mining (and why we really need open access)

Cdc28p
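Orthographic variation such as CDC28 versus Cdc28p can be handled by mapping every spelling to a shared key before looking it up in the synonyms list. The sketch below is one possible approach under stated assumptions: the rules (ignore case, hyphens and spaces, and drop a trailing "p" after a digit) and the one-entry `LEXICON` are hypothetical and chosen only to cover the variants shown on the slides.

```python
import re

def normalize_name(name):
    """Collapse case, hyphens and whitespace; drop a yeast-style trailing 'p'."""
    key = re.sub(r"[\s\-]", "", name.lower())
    # Assumption: a final "p" directly after a digit marks the protein (Cdc28p).
    return re.sub(r"(?<=\d)p$", "", key)

# Hypothetical one-entry synonyms list keyed by normalized name.
LEXICON = {normalize_name("CDC28"): "CDC28"}

def find_mentions(text):
    """Return (surface form, canonical name) pairs for tokens found in LEXICON."""
    mentions = []
    for token in re.findall(r"[A-Za-z0-9\-]+", text):
        canonical = LEXICON.get(normalize_name(token))
        if canonical:
            mentions.append((token, canonical))
    return mentions
```

With these rules, `find_mentions("CDC28 and Cdc28p and Cdc-28")` maps all three spellings to the same entry, while a different gene such as Cdc5 is left alone.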

Page 38: Biomedical literature mining (and why we really need open access)

disambiguation

Page 39: Biomedical literature mining (and why we really need open access)

hairy

Page 40: Biomedical literature mining (and why we really need open access)

SDS

Page 41: Biomedical literature mining (and why we really need open access)

Cdc2
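Names like "hairy" or "SDS" are also ordinary words or abbreviations, so a dictionary match alone is not enough. One simple way to disambiguate, sketched here purely as an assumption and not as the method behind these slides, is to accept an ambiguous name only when a cue word appears nearby. The `GENE_CUES` set and the window size are hypothetical.

```python
import re

# Hypothetical cue words that suggest a gene or protein reading of a name.
GENE_CUES = {"gene", "protein", "mutant", "expression", "kinase", "homolog"}

def looks_like_gene(text, name, window=3):
    """True if `name` occurs with a cue word within `window` tokens on either side."""
    tokens = [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]
    target = name.lower()
    for i, token in enumerate(tokens):
        if token == target:
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            if GENE_CUES & set(context):
                return True
    return False
```

For example, "Expression of the hairy gene in Drosophila embryos" passes the check, whereas "The caterpillar has a hairy body" does not.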

Page 42: Biomedical literature mining (and why we really need open access)
Page 43: Biomedical literature mining (and why we really need open access)
Page 44: Biomedical literature mining (and why we really need open access)

abstracts

Page 45: Biomedical literature mining (and why we really need open access)

complete papers

Page 46: Biomedical literature mining (and why we really need open access)

information extraction

Page 47: Biomedical literature mining (and why we really need open access)

formalizing the facts

Page 48: Biomedical literature mining (and why we really need open access)
Page 49: Biomedical literature mining (and why we really need open access)

co-mentioning

Page 50: Biomedical literature mining (and why we really need open access)

statistical methods
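Co-mentioning combined with a statistical score can be illustrated as follows: count how often two recognized entities appear in the same abstract and compare that with what their individual frequencies would predict. The log-ratio used here is one common choice and an assumption of this sketch, as are the function name and the toy input; the slides do not specify a particular statistic.

```python
import math
from collections import Counter
from itertools import combinations

def comention_scores(documents):
    """Score entity pairs by log2(observed / expected) co-mentions.

    documents: list of sets, each holding the entities recognized in one abstract.
    """
    n_docs = len(documents)
    single = Counter()
    pairs = Counter()
    for entities in documents:
        single.update(entities)
        pairs.update(combinations(sorted(entities), 2))
    scores = {}
    for (a, b), observed in pairs.items():
        # Expected co-mentions if a and b were mentioned independently.
        expected = single[a] * single[b] / n_docs
        scores[(a, b)] = math.log2(observed / expected)
    return scores

# Toy input: entity sets invented for illustration only.
toy_abstracts = [
    {"Cdc28", "Swe1"},
    {"Cdc28", "Swe1", "Cdc5"},
    {"Cdc28", "Clb2"},
    {"Hap1", "Cyc1"},
    {"Hap1", "Cyc7"},
    {"Cdc28"},
]
scores = comention_scores(toy_abstracts)
```

Pairs that are co-mentioned more often than chance receive a positive score; pairs that never share an abstract do not appear in the result at all.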

Page 51: Biomedical literature mining (and why we really need open access)

NLP (Natural Language Processing)

Page 52: Biomedical literature mining (and why we really need open access)

Gene and protein names

Cue words for entity recognition

Verbs for relation extraction

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

Page 53: Biomedical literature mining (and why we really need open access)

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1 hyperphosphorylation and degradation

Page 54: Biomedical literature mining (and why we really need open access)

Jensen et al., Nature Reviews Genetics, 2006

Page 55: Biomedical literature mining (and why we really need open access)

new discoveries

Page 56: Biomedical literature mining (and why we really need open access)

text mining

Page 57: Biomedical literature mining (and why we really need open access)
Page 58: Biomedical literature mining (and why we really need open access)

Jensen et al., Nature Reviews Genetics, 2006

Page 59: Biomedical literature mining (and why we really need open access)

abstracts

Page 60: Biomedical literature mining (and why we really need open access)

complete papers

Page 61: Biomedical literature mining (and why we really need open access)

temporal trends

Page 62: Biomedical literature mining (and why we really need open access)

Jensen et al., Nature Reviews Genetics, 2006
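A temporal trend of this kind can be approximated by counting, for each publication year, the fraction of abstracts that mention a given term. The sketch below assumes the input is a list of (year, abstract text) pairs; the function name and the plain substring match are simplifications made for illustration.

```python
from collections import defaultdict

def yearly_fraction(records, term):
    """Return {year: fraction of that year's abstracts mentioning term}.

    records: iterable of (year, abstract_text) pairs.
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    needle = term.lower()
    for year, text in records:
        total[year] += 1
        if needle in text.lower():
            hits[year] += 1
    return {year: hits[year] / total[year] for year in sorted(total)}
```

Normalizing by the number of abstracts per year matters because the literature itself grows over time, so raw counts would rise for almost any term.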

Page 63: Biomedical literature mining (and why we really need open access)

buzzwords

Page 64: Biomedical literature mining (and why we really need open access)

Jensen et al., Nature Reviews Genetics, 2006

Page 65: Biomedical literature mining (and why we really need open access)

grant applications

Page 66: Biomedical literature mining (and why we really need open access)

integration of text and data

Page 67: Biomedical literature mining (and why we really need open access)

Genomic neighborhood

Species co-occurrence

Gene fusions

Database imports

Experimental interaction data

Microarray expression data

Literature mining

Page 68: Biomedical literature mining (and why we really need open access)

genotype to phenotype

Page 69: Biomedical literature mining (and why we really need open access)

Korbel et al., PLoS Biology, 2005

Page 70: Biomedical literature mining (and why we really need open access)

Korbel et al., PLoS Biology, 2005

Page 71: Biomedical literature mining (and why we really need open access)

Korbel et al., PLoS Biology, 2005

Page 72: Biomedical literature mining (and why we really need open access)

where are we now?

Page 73: Biomedical literature mining (and why we really need open access)

Jensen et al., Nature Reviews Genetics, 2006

Page 74: Biomedical literature mining (and why we really need open access)

abstracts

Page 75: Biomedical literature mining (and why we really need open access)

complete papers

Page 76: Biomedical literature mining (and why we really need open access)

restricted access

Page 77: Biomedical literature mining (and why we really need open access)

open access

Page 78: Biomedical literature mining (and why we really need open access)

the tools are there

Page 79: Biomedical literature mining (and why we really need open access)

now we need the text!

Page 80: Biomedical literature mining (and why we really need open access)

Acknowledgments

Jasmin Saric, Rossitza Ouzounova

Michael Kuhn, Jan Korbel

Tobias Doerks, Isabel Rojas

Miguel Andrade, Peer Bork