open access – making the most of biomedical literature mining

105
Open access – making the most of biomedical literature mining Lars Juhl Jensen EMBL Heidelberg

Upload: tave

Post on 13-Jan-2016

24 views

Category:

Documents


0 download

DESCRIPTION

Open access – making the most of biomedical literature mining. Lars Juhl Jensen EMBL Heidelberg. why open access?. why biomedicine?. why literature mining?. M EDLINE. - PowerPoint PPT Presentation

TRANSCRIPT

Page 1: Open access – making the most of biomedical literature mining

Open access – making the most of biomedical literature mining

Lars Juhl JensenEMBL Heidelberg

Page 2: Open access – making the most of biomedical literature mining

why open access?

Page 3: Open access – making the most of biomedical literature mining

why biomedicine?

Page 4: Open access – making the most of biomedical literature mining

why literature mining?

Page 5: Open access – making the most of biomedical literature mining

MEDLINE

Page 6: Open access – making the most of biomedical literature mining
Page 7: Open access – making the most of biomedical literature mining

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1

hyperphosphorylation and degradation

Page 8: Open access – making the most of biomedical literature mining

information retrieval

Page 9: Open access – making the most of biomedical literature mining

finding the papers

Page 10: Open access – making the most of biomedical literature mining

if you can’t find them …

Page 11: Open access – making the most of biomedical literature mining

… they don’t exist!

Page 12: Open access – making the most of biomedical literature mining

ad hoc retrieval

Page 13: Open access – making the most of biomedical literature mining

users-specified query

Page 14: Open access – making the most of biomedical literature mining

“yeast AND cell cycle”

Page 15: Open access – making the most of biomedical literature mining

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1

hyperphosphorylation and degradation

Page 16: Open access – making the most of biomedical literature mining
Page 17: Open access – making the most of biomedical literature mining
Page 18: Open access – making the most of biomedical literature mining

MEDLINE

Page 19: Open access – making the most of biomedical literature mining

abstracts

Page 20: Open access – making the most of biomedical literature mining

complete papers

Page 21: Open access – making the most of biomedical literature mining
Page 22: Open access – making the most of biomedical literature mining
Page 23: Open access – making the most of biomedical literature mining

tricks

Page 24: Open access – making the most of biomedical literature mining

stemming

Page 25: Open access – making the most of biomedical literature mining

yeast / yeasts

Page 26: Open access – making the most of biomedical literature mining

synonyms

Page 27: Open access – making the most of biomedical literature mining

yeast / S. cerevisiae

Page 28: Open access – making the most of biomedical literature mining

dynamic query expansion

Page 29: Open access – making the most of biomedical literature mining

next logical step

Page 30: Open access – making the most of biomedical literature mining

ontologies

Page 31: Open access – making the most of biomedical literature mining

annotation

Page 32: Open access – making the most of biomedical literature mining

Cdc28 yeast gene

Page 33: Open access – making the most of biomedical literature mining

Cdc28 cell cycle

Page 34: Open access – making the most of biomedical literature mining

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1

hyperphosphorylation and degradation

Page 35: Open access – making the most of biomedical literature mining

“yeast AND cell cycle”

Page 36: Open access – making the most of biomedical literature mining

entity recognition

Page 37: Open access – making the most of biomedical literature mining

identifying the substance(s)

Page 38: Open access – making the most of biomedical literature mining

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1

hyperphosphorylation and degradation

Page 39: Open access – making the most of biomedical literature mining

if you can’t find them …

Page 40: Open access – making the most of biomedical literature mining

… they don’t exist!

Page 41: Open access – making the most of biomedical literature mining
Page 42: Open access – making the most of biomedical literature mining
Page 43: Open access – making the most of biomedical literature mining
Page 44: Open access – making the most of biomedical literature mining

abstracts

Page 45: Open access – making the most of biomedical literature mining

MEDLINE

Page 46: Open access – making the most of biomedical literature mining

tricks

Page 47: Open access – making the most of biomedical literature mining

good synonyms list

Page 48: Open access – making the most of biomedical literature mining

manual curation

Page 49: Open access – making the most of biomedical literature mining

orthographic variation

Page 50: Open access – making the most of biomedical literature mining

CDC28

Page 51: Open access – making the most of biomedical literature mining

Cdc28p

Page 52: Open access – making the most of biomedical literature mining

disambiguation

Page 53: Open access – making the most of biomedical literature mining

hairy

Page 54: Open access – making the most of biomedical literature mining

SDS

Page 55: Open access – making the most of biomedical literature mining

Cdc2

Page 56: Open access – making the most of biomedical literature mining

information extraction

Page 57: Open access – making the most of biomedical literature mining

formalizing the facts

Page 58: Open access – making the most of biomedical literature mining

co-mentioning

Page 59: Open access – making the most of biomedical literature mining

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1

hyperphosphorylation and degradation

Page 60: Open access – making the most of biomedical literature mining
Page 61: Open access – making the most of biomedical literature mining

NLPNatural Language Processing

Page 62: Open access – making the most of biomedical literature mining

Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1 and this modification served as a priming step to promote subsequent Cdc5-dependent Swe1

hyperphosphorylation and degradation

Page 63: Open access – making the most of biomedical literature mining

Gene and protein names

Cue words for entity recognition

Verbs for relation extraction

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]]is controlled by[nxpg HAP1]

Page 64: Open access – making the most of biomedical literature mining
Page 65: Open access – making the most of biomedical literature mining

new discoveries

Page 66: Open access – making the most of biomedical literature mining

text mining

Page 67: Open access – making the most of biomedical literature mining
Page 68: Open access – making the most of biomedical literature mining
Page 69: Open access – making the most of biomedical literature mining

temporal trends

Page 70: Open access – making the most of biomedical literature mining
Page 71: Open access – making the most of biomedical literature mining

buzzwords

Page 72: Open access – making the most of biomedical literature mining
Page 73: Open access – making the most of biomedical literature mining

grant applications

Page 74: Open access – making the most of biomedical literature mining

global correlations

Page 75: Open access – making the most of biomedical literature mining

3279 83

3592

Regulates Regulated

P < 910-9

Page 76: Open access – making the most of biomedical literature mining

transcriptional networks

Page 77: Open access – making the most of biomedical literature mining

1127 44

3704

Phosphorylates Phosphorylated

P < 210-7

Page 78: Open access – making the most of biomedical literature mining

signal cascades

Page 79: Open access – making the most of biomedical literature mining

8107 47

3625

Expression Phosphorylation

P < 510-4

Page 80: Open access – making the most of biomedical literature mining
Page 81: Open access – making the most of biomedical literature mining

integration of text and data

Page 82: Open access – making the most of biomedical literature mining
Page 83: Open access – making the most of biomedical literature mining
Page 84: Open access – making the most of biomedical literature mining
Page 85: Open access – making the most of biomedical literature mining
Page 86: Open access – making the most of biomedical literature mining

network mining

Page 87: Open access – making the most of biomedical literature mining

linking genes to diseases

Page 88: Open access – making the most of biomedical literature mining
Page 89: Open access – making the most of biomedical literature mining
Page 90: Open access – making the most of biomedical literature mining
Page 91: Open access – making the most of biomedical literature mining
Page 92: Open access – making the most of biomedical literature mining

multifactorial diseases

Page 93: Open access – making the most of biomedical literature mining

genotype to phenotype

Page 94: Open access – making the most of biomedical literature mining
Page 95: Open access – making the most of biomedical literature mining
Page 96: Open access – making the most of biomedical literature mining
Page 97: Open access – making the most of biomedical literature mining

where are we now?

Page 98: Open access – making the most of biomedical literature mining
Page 99: Open access – making the most of biomedical literature mining

abstracts

Page 100: Open access – making the most of biomedical literature mining

complete papers

Page 101: Open access – making the most of biomedical literature mining

restricted access

Page 102: Open access – making the most of biomedical literature mining

open access

Page 103: Open access – making the most of biomedical literature mining

the tools are there

Page 104: Open access – making the most of biomedical literature mining

now we need the text!

Page 105: Open access – making the most of biomedical literature mining

Acknowledgments

Jasmin Saric

Rossitza Ouzounova

Michael Kuhn

Isabel Rojas

Miguel Andrade

Peer Bork