Nowomics at Cambridge Open Research
Biomedical data are being generated and published at an unprecedented rate
How do I keep track of it?
[Chart: New papers added to PubMed each year, 2003 to 2013; y-axis from 0 to 1,200,000]
~20,000 new abstracts added per week
interactions, diseases, mutations, model organisms, genome annotation, proteins, literature, pathways, gene expression
1500 biological databases*
* Nucleic Acids Research 2014 Database Issue
Fetch data every day
nowomics
Work out what’s changed
Organise by gene, disease, process, author, etc.
Personalised news feed & email alerts
Follow: users follow what they work on
Literature & databases: link to original data source
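The fetch, compare and organise steps can be sketched roughly as below. This is only an illustration under assumptions, not the Nowomics implementation: the record layout, the function names (`diff_snapshots`, `organise_by_gene`, `feed_for_user`) and the sample records are all made up for the example.

```python
from collections import defaultdict

def diff_snapshots(previous, current):
    """Return records that are new or changed between two daily snapshots.

    Both arguments map a record ID to a dict of fields (hypothetical layout).
    """
    changes = []
    for record_id, record in current.items():
        if record_id not in previous:
            changes.append(("added", record_id, record))
        elif previous[record_id] != record:
            changes.append(("updated", record_id, record))
    return changes

def organise_by_gene(changes):
    """Group change events by the gene each record refers to."""
    feed = defaultdict(list)
    for kind, record_id, record in changes:
        feed[record["gene"]].append((kind, record_id))
    return dict(feed)

def feed_for_user(feed, followed_genes):
    """Keep only the events for genes the user follows."""
    return {gene: events for gene, events in feed.items() if gene in followed_genes}

# Placeholder snapshots; the IDs and texts are invented for illustration.
yesterday = {
    "rif:1": {"gene": "CHEK1", "text": "placeholder annotation A"},
}
today = {
    "rif:1": {"gene": "CHEK1", "text": "placeholder annotation A (revised)"},
    "rif:2": {"gene": "ERBB2", "text": "placeholder annotation B"},
}

changes = diff_snapshots(yesterday, today)
feed = organise_by_gene(changes)
print(feed_for_user(feed, {"ERBB2"}))
```

Each event keeps its record ID, so a feed entry can link back to the original data source.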
1. Publications
2. Experimental data
• gene expression shows gene is active in the liver
• yeast 2-hybrid screen shows two proteins interact
3. Curated annotation: experts read papers and extract important conclusions
• a paper reports a gene is related to Parkinson’s disease
DIFFERENT TO PAPER ALERTS?
• Specific to biology, built into database
• Orthologs & related genes
• Data types beyond papers
• Synonyms & dictionaries to identify genes/terms
• CHEK1, CHK1, CHK-1, checkpoint kinase 1, cell cycle checkpoint kinase
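One way a synonym dictionary could be used to recognise genes in abstract text is sketched below. The synonym lists come from these slides. The matching approach (a case-insensitive regular expression, longest name first) and the function names are assumptions made for illustration, and real text mining would need more careful disambiguation.

```python
import re

# Synonym lists taken from the slides; the dictionary layout is an assumption.
SYNONYMS = {
    "CHEK1": ["CHEK1", "CHK1", "CHK-1", "checkpoint kinase 1",
              "cell cycle checkpoint kinase"],
    "ERBB2": ["ERBB2", "ERBB-2", "HER2", "HER-2"],
}

def build_matcher(synonyms):
    """Compile one pattern for all names and a lookup back to the main symbol."""
    lookup = {}
    for symbol, names in synonyms.items():
        for name in names:
            lookup[name.lower()] = symbol
    # Try longer names first so a long name wins over a shorter overlapping one.
    alternatives = sorted(lookup, key=len, reverse=True)
    body = "|".join(re.escape(name) for name in alternatives)
    # Require that the match is not glued to a neighbouring word character or hyphen.
    pattern = re.compile(r"(?<![\w-])(" + body + r")(?![\w-])", re.IGNORECASE)
    return pattern, lookup

def count_genes(text, pattern, lookup):
    """Count mentions per main gene symbol, whichever synonym was used."""
    counts = {}
    for match in pattern.finditer(text):
        symbol = lookup[match.group(1).lower()]
        counts[symbol] = counts.get(symbol, 0) + 1
    return counts

pattern, lookup = build_matcher(SYNONYMS)
sentence = ("CHK1 phosphorylation was reduced, and checkpoint kinase 1 "
            "levels fell in HER-2 positive cells.")
print(count_genes(sentence, pattern, lookup))
```

Counting by main symbol rather than by surface form lets mentions written as HER2, HER-2 or ERBB-2 be grouped under a single gene.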
[Charts: Genes in Abstracts, 2013]
SLC2A1 - 1938 mentions, split across the names SLC2A1, GLUT1, GLUT, GLUT-1 and other
ERBB2 - 10836 mentions, split across the names ERBB2, HER2, HER-2, ERBB-2 and other
GENE ONTOLOGY
• Hierarchy of 40,000 terms
• biological process, molecular function, cellular component
• Curated and automatic links from terms to genes
Also alerted to child term annotation
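The child term behaviour can be illustrated with a small sketch: following a term also matches annotations made to any term beneath it in the hierarchy. The tiny hierarchy below uses illustrative names rather than real Gene Ontology identifiers, and the function names are hypothetical.

```python
def descendants(term, children):
    """Collect every term below `term`, following child links transitively."""
    seen = set()
    stack = [term]
    while stack:
        node = stack.pop()
        for child in children.get(node, ()):
            if child not in seen:  # a term can be reachable through several parents
                seen.add(child)
                stack.append(child)
    return seen

def should_alert(followed_term, annotated_term, children):
    """Alert if the annotation is on the followed term or any of its descendants."""
    return (annotated_term == followed_term
            or annotated_term in descendants(followed_term, children))

# Toy hierarchy (parent -> children); names are placeholders for illustration.
CHILDREN = {
    "process": ["cell cycle", "metabolism"],
    "cell cycle": ["cell cycle checkpoint"],
}

print(should_alert("cell cycle", "cell cycle checkpoint", CHILDREN))
print(should_alert("cell cycle", "metabolism", CHILDREN))
```

With 40,000 terms, recomputing the descendants on every check would be wasteful, so a real system would probably precompute them; the sketch keeps the traversal inline for clarity.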
[Chart: Gene Ontology Assignments 2013, monthly from Mar to Dec; series: Human, Mouse, Rat, Fly; y-axis from 0 to 30000]
NCBI GeneRIFs
• Gene Reference into Function
• Concise phrase describing function (425 characters)
• Curated by NCBI or submitted by users
[Chart: GeneRIFs/Gene 2013, monthly from Jan to Dec; series: Human, Mouse, Rat, Fly; y-axis from 0 to 6000]
[Chart: Text mined genes in abstracts 2013, monthly from Jan to Dec; series: Human, Mouse, Rat, Fly; y-axis from 0 to 50000]
DATA SHARING
Creating data/ publishing
Looking for information
AIM: enable interchange of relevant information
In a news feed, the original source of data matters less
STORY SO FAR…
• biological data integration
• query system
• used by several model organism databases
keeping track of new data is hard…
• Revenue from antibody & reagent listings
• Subscription fee for companies
Nowomics Ltd
Why a company instead of academia?
Software tools in academia
Good for users
• functionality
• usability & design
• maintained, updated
• algorithm/technology
Academic project
Publish & move on
aim to please users
reward structure
convince grant funders
Money
Exception: NCBI, EBI, model organism databases
Company
Money
aim to please users
reward structure
Good for users
• functionality
• usability & design
• maintained, updated
• algorithm/technology
stay in business
COMING SOON
• Search & filter news feed
• Favourites and recommendations
• Portals for Drosophila and Arabidopsis
• Identifying trends
• More data sources: Figshare, arXiv, gene expression, SNPs, pathways
Workshop - Monday 31st March, Dept of Genetics
see blog.nowomics.com