Lars Juhl Jensen Biomedical text mining. exponential growth
![Page 1: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/1.jpg)
Lars Juhl Jensen
Biomedical text mining
![Page 2: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/2.jpg)
exponential growth
![Page 3: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/3.jpg)
![Page 4: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/4.jpg)
![Page 5: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/5.jpg)
~45 seconds per paper
![Page 6: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/6.jpg)
information retrieval
![Page 7: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/7.jpg)
named entity recognition
![Page 8: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/8.jpg)
augmented browsing
![Page 9: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/9.jpg)
text corpora
![Page 10: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/10.jpg)
information extraction
![Page 11: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/11.jpg)
information retrieval
![Page 12: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/12.jpg)
find the relevant papers
![Page 13: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/13.jpg)
ad hoc retrieval
![Page 14: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/14.jpg)
user-specified query
![Page 15: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/15.jpg)
“yeast AND cell cycle”
![Page 16: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/16.jpg)
PubMed
![Page 17: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/17.jpg)
![Page 18: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/18.jpg)
indexing
![Page 19: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/19.jpg)
fast lookup
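A minimal sketch of how indexing gives fast lookup for an ad hoc query such as "yeast AND cell cycle". It assumes a toy inverted index over three made-up document strings. The names `build_index` and `search_and` are illustrative, not PubMed's implementation, and "cell cycle" is treated as two separate words rather than a phrase.

```python
from collections import defaultdict


def build_index(docs):
    """Map each lower-cased token to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def search_and(index, terms):
    """Return ids of documents that contain every term (Boolean AND)."""
    postings = [index.get(term.lower(), set()) for term in terms]
    return set.intersection(*postings) if postings else set()


# Made-up documents, for illustration only.
docs = {
    1: "yeast cell cycle regulation",
    2: "human cell cycle checkpoints",
    3: "yeast metabolism",
}
index = build_index(docs)
hits = search_and(index, ["yeast", "cell", "cycle"])
```

The query only intersects the posting sets of its terms, so the cost depends on how many documents mention those terms rather than on the size of the whole corpus.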
![Page 20: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/20.jpg)
stemming
![Page 21: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/21.jpg)
word endings
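A rough sketch of stemming by stripping word endings. The suffix list and the `min_stem` cutoff are assumptions chosen for illustration. This is a crude stand-in, not the Porter stemmer or whatever PubMed actually uses.

```python
# Assumed suffix list, longest first so "ations" is tried before "s".
SUFFIXES = ("ations", "ation", "ing", "ed", "es", "s")


def crude_stem(word, min_stem=3):
    """Strip the first matching suffix, keeping at least min_stem characters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem:
            return word[: -len(suffix)]
    return word


stems = {w: crude_stem(w) for w in ("phosphorylated", "phosphorylates", "phosphorylating")}
```

All three verb forms reduce to the same stem, so a query for one of them can match documents that use another.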
![Page 22: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/22.jpg)
dynamic query expansion
![Page 23: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/23.jpg)
MeSH terms
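A small sketch of dynamic query expansion, where each query term is replaced by an OR-group of equivalent terms before the search runs. The `SYNONYMS` table is hand-written for illustration and is not taken from MeSH. The function names are likewise hypothetical.

```python
# Hand-written toy synonym table (an assumption, not real MeSH data).
SYNONYMS = {
    "yeast": ["yeast", "saccharomyces cerevisiae"],
    "cell cycle": ["cell cycle", "cell division cycle"],
}


def quote(term):
    """Quote multiword terms so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def expand_query(terms, synonyms=SYNONYMS):
    """Join the variants of each term with OR, and the groups with AND."""
    groups = []
    for term in terms:
        variants = synonyms.get(term.lower(), [term])
        groups.append("(" + " OR ".join(quote(v) for v in variants) + ")")
    return " AND ".join(groups)


expanded = expand_query(["yeast", "cell cycle"])
```

Terms without an entry in the table pass through unchanged, so the expansion never narrows the original query.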
![Page 24: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/24.jpg)
Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1
and this modification served as a priming step to promote subsequent
Cdc5-dependent Swe1 hyperphosphorylation and degradation
![Page 25: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/25.jpg)
no tool will find that
![Page 26: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/26.jpg)
named entity recognition
![Page 27: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/27.jpg)
computer
![Page 28: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/28.jpg)
as smart as a dog
![Page 29: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/29.jpg)
teach it specific tricks
![Page 30: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/30.jpg)
![Page 31: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/31.jpg)
![Page 32: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/32.jpg)
identify the concepts
![Page 33: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/33.jpg)
Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1
and this modification served as a priming step to promote subsequent
Cdc5-dependent Swe1 hyperphosphorylation and degradation
![Page 34: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/34.jpg)
comprehensive lexicon
![Page 35: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/35.jpg)
proteins
![Page 36: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/36.jpg)
chemicals
![Page 37: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/37.jpg)
compartments
![Page 38: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/38.jpg)
tissues
![Page 39: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/39.jpg)
diseases
![Page 40: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/40.jpg)
organisms
![Page 41: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/41.jpg)
CDC2
![Page 42: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/42.jpg)
cyclin dependent kinase 1
![Page 43: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/43.jpg)
orthographic variation
![Page 44: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/44.jpg)
upper- and lower-case
![Page 45: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/45.jpg)
CDC2
![Page 46: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/46.jpg)
Cdc2
![Page 47: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/47.jpg)
spaces and hyphens
![Page 48: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/48.jpg)
cyclin dependent kinase 1
![Page 49: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/49.jpg)
cyclin-dependent kinase 1
![Page 50: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/50.jpg)
prefixes and postfixes
![Page 51: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/51.jpg)
CDC2
![Page 52: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/52.jpg)
hCDC2
![Page 53: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/53.jpg)
“black list”
![Page 54: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/54.jpg)
SDS
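A sketch of dictionary-based matching that tolerates the orthographic variation listed above (case, spaces and hyphens, a one-letter prefix) and consults a black list. The lexicon entries, the `CDK1` identifier, and the single-letter prefix rule are assumptions for illustration. `SDS` stands in for a name assumed to be too ambiguous to tag.

```python
import re

# Toy lexicon mapping names to an illustrative identifier (assumed values).
LEXICON = {
    "CDC2": "CDK1",
    "cyclin dependent kinase 1": "CDK1",
    "SDS": "SDS",  # placeholder entry for an ambiguous name
}
BLACK_LIST = {"SDS"}


def normalize(name):
    """Lower-case and drop spaces and hyphens so spelling variants collide."""
    return re.sub(r"[\s-]+", "", name.lower())


NORMALIZED = {normalize(name): ident for name, ident in LEXICON.items()}
BLOCKED = {normalize(name) for name in BLACK_LIST}


def lookup(mention):
    """Return the identifier for a mention, or None if unknown or black-listed."""
    key = normalize(mention)
    if key in BLOCKED:
        return None
    if key in NORMALIZED:
        return NORMALIZED[key]
    # Allow a single-letter prefix such as the "h" in hCDC2.
    if len(key) > 1 and key[1:] in NORMALIZED and key[1:] not in BLOCKED:
        return NORMALIZED[key[1:]]
    return None


results = {m: lookup(m) for m in ("CDC2", "Cdc2", "cyclin-dependent kinase 1", "hCDC2", "SDS")}
```

Normalizing both the lexicon and the text once keeps each lookup a plain hash-table access, which is one way to keep the matching scalable.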
![Page 55: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/55.jpg)
scalable implementation
![Page 56: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/56.jpg)
text corpora
![Page 57: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/57.jpg)
>10 km
<10 hours
![Page 58: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/58.jpg)
most use Medline
![Page 59: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/59.jpg)
~22 million abstracts
![Page 60: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/60.jpg)
few use full-text articles
![Page 61: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/61.jpg)
no access
![Page 62: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/62.jpg)
PDF files
![Page 63: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/63.jpg)
![Page 64: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/64.jpg)
layout-aware extraction
![Page 65: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/65.jpg)
millions of full-text articles
![Page 66: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/66.jpg)
information extraction
![Page 67: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/67.jpg)
formalize the facts
![Page 68: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/68.jpg)
Mitotic cyclin (Clb2)-bound Cdc28 (Cdk1 homolog) directly phosphorylated Swe1
and this modification served as a priming step to promote subsequent
Cdc5-dependent Swe1 hyperphosphorylation and degradation
![Page 69: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/69.jpg)
two approaches
![Page 70: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/70.jpg)
co-mentioning
![Page 71: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/71.jpg)
counting
![Page 72: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/72.jpg)
within documents
![Page 73: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/73.jpg)
within paragraphs
![Page 74: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/74.jpg)
within sentences
![Page 75: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/75.jpg)
co-mentioning score
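A sketch of a co-mentioning score built by counting how often two names appear together within documents and within sentences. The weights, the entity list, and the two example texts are assumptions. The slides do not give the actual scoring formula, so this only shows the counting idea, with paragraphs omitted for brevity.

```python
import re
from collections import Counter
from itertools import combinations

ENTITIES = ["Cdc28", "Swe1", "Cdc5", "Clb2"]
# Assumed weights: a shared sentence counts more than a shared document.
WEIGHTS = {"document": 1.0, "sentence": 2.0}


def entities_in(text):
    """Return the set of known entity names mentioned in the text."""
    return {e for e in ENTITIES if re.search(rf"\b{re.escape(e)}\b", text)}


def comention_scores(documents):
    """Sum weighted co-mention counts for every pair of entities."""
    scores = Counter()
    for doc in documents:
        for pair in combinations(sorted(entities_in(doc)), 2):
            scores[pair] += WEIGHTS["document"]
        # Naive sentence split on ., ! or ? followed by whitespace.
        for sentence in re.split(r"(?<=[.!?])\s+", doc):
            for pair in combinations(sorted(entities_in(sentence)), 2):
                scores[pair] += WEIGHTS["sentence"]
    return scores


# Made-up texts, for illustration only.
docs = [
    "Cdc28 phosphorylated Swe1. Cdc5 was also required.",
    "Swe1 degradation depends on Cdc5.",
]
scores = comention_scores(docs)
```

Here `scores[("Cdc5", "Swe1")]` is 4.0 (two documents plus one shared sentence) and `scores[("Cdc28", "Cdc5")]` is 1.0 (one shared document only).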
![Page 76: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/76.jpg)
NLP
Natural Language Processing
![Page 77: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/77.jpg)
grammatical analysis
![Page 78: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/78.jpg)
part-of-speech tagging
![Page 79: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/79.jpg)
multiword detection
![Page 80: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/80.jpg)
semantic tagging
![Page 81: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/81.jpg)
sentence parsing
![Page 82: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/82.jpg)
Gene and protein names
Cue words for entity recognition
Verbs for relation extraction
[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]
![Page 83: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/83.jpg)
extract stated facts
![Page 84: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/84.jpg)
high precision
![Page 85: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/85.jpg)
poor recall
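A sketch of why extracting stated facts tends toward high precision but poor recall. It uses a single hand-written pattern for "X is controlled by Y" in place of real grammatical analysis. The gene-name regex and the relation label are assumptions, and the first sentence is the slide example with its bracket annotations removed.

```python
import re

# Assumed naming pattern: three capitals plus digits, which fits CYC1, CYC7, HAP1.
GENE = r"[A-Z]{3}\d+"

PATTERN = re.compile(
    rf"(?P<targets>{GENE}(?:\s+and\s+{GENE})*)\s+is controlled by\s+(?P<regulator>{GENE})"
)


def extract_regulation(sentence):
    """Return (regulator, relation, target) triples stated in the sentence."""
    match = PATTERN.search(sentence)
    if not match:
        return []
    targets = re.findall(GENE, match.group("targets"))
    return [(match.group("regulator"), "controls", t) for t in targets]


stated = extract_regulation(
    "The expression of the cytochrome genes CYC1 and CYC7 is controlled by HAP1"
)
missed = extract_regulation("HAP1 regulates CYC1")  # same fact, different wording
```

The pattern recovers both HAP1 targets from the first sentence, but returns nothing for the paraphrase, which is the recall problem in miniature.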
![Page 86: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/86.jpg)
Exercise
Go to http://diseases.jensenlab.org
Find TYMS disease associations
Inspect the text-mining evidence
Look for examples of synonym usage
Find genes linked to colorectal cancer
![Page 87: Lars Juhl Jensen Biomedical text mining. exponential growth](https://reader036.vdocuments.site/reader036/viewer/2022062519/56649ed35503460f94be2e8f/html5/thumbnails/87.jpg)
thank you!