BioMedical Data Everywhere: Recent Developments in Data Management and Policy at NIH Jerry Sheehan Assistant Director for Policy Development National Library of Medicine - National Institutes of Health [email protected] CASC Fall Meeting September 8, 2011, Arlington, VA


Page 1: BioMedical Data Everywhere: Recent Developments in Data Management and Policy at NIH

BioMedical Data Everywhere: Recent Developments in Data Management and Policy at NIH

Jerry Sheehan
Assistant Director for Policy Development
National Library of Medicine - National Institutes of Health
[email protected]

CASC Fall Meeting
September 8, 2011, Arlington, VA

Page 2

National Library of Medicine: More than a Library

• World’s largest medical library
  – >12 million physical artifacts (books, journals, technical reports, photographs)
  – >22,000 print and electronic serial subscriptions
  – Historical collection of rare and old medical works
• Intramural research laboratories
  – Lister Hill National Center for Biomedical Communications
  – National Center for Biotechnology Information
• Extramural research and training
  – ~100 research projects per year, $36M
  – 18 funded research training sites, 250 trainees
• Health data standards and vocabularies
• Information resources and services
  – Publications and metadata
  – Genomic, chemical, clinical trial data
  – Environmental health and toxicology data
  – Disaster information services & systems
  – Medical images, analytical tools

www.nlm.nih.gov

Page 3

NLM Information Resources

• Publications
  – Citations/metadata (PubMed)
  – Full-text articles (PubMed Central)
• Data
  – Genomic (GenBank, dbGaP, GEO, GeneTest)
  – Clinical trials (ClinicalTrials.gov)
  – Drug (RxNorm, DailyMed, Pillbox)
  – Chemical (PubChem)
  – Environmental & toxicology
• Images
  – Visible Human
  – Spine x-rays, cervical images
  – Historical photos
• Synthesized information
  – Evidence summaries
  – Guidelines
  – Consumer health information (MedlinePlus)
• Vocabulary resources
  – Unified Medical Language System
  – Standard clinical terms (SNOMED)
  – Health data interchange
  – Biomedical terms
• Software & Tools
  – APIs
  – Natural language processing
  – Image analysis
  – Mobile apps

Page 4

Page 5

PubMed/Medline: Journal Citations
http://www.pubmed.gov

CONTENT
• 21+ million citations and abstracts
  – 700,000 added per year
  – 50%+ link to full text
• 5,500+ journals
  – 120-130 added per year

USAGE (2010)
• 120+ million visitors
• 2 million searches per day
• 2.4 billion page views
• Google, Bing, others
• Content used by outside developers
• Mobile version

QUALITY

[Figure: Growth in Medline, the fully indexed subset of PubMed, which accounts for approximately 90% of all PubMed citations. Original graph: http://www.nlm.nih.gov/bsd/stats/cit_added.html]

Page 6

PubMed Central: Full-Text Articles
www.pubmedcentral.gov

+ 2.2 million full-text articles, 26 thousand more added per month

Typical weekday usage:
• 420,000 different users
• 740,000 articles retrieved

Annually:
• ~99% of articles downloaded at least once
• 28% downloaded more than 100 times

Page 7

ClinicalTrials.gov
http://clinicaltrials.gov/

[Figure: Studies Registered at ClinicalTrials.gov since May 1, 2005]

Registry and Results Database
• Federally and privately supported trials
• Conducted in the United States and 170+ countries
• Mandatory submission for some trials

Current content
• 100,000+ registered trials
• 330 new registrations/week
• 3,000+ results (summary) of approved products
  o Outcome measures
  o Statistical analyses
  o Adverse events

Usage (2010)
• 28,000 visitors per day

Page 8

Page 9

Repository for NIH-funded GWA studies

As of Aug 2011:
• 161 studies
• 2,045 data sets
• 2,727 documents
• 5,890 analyses
• 128,190 variables

Page 10

As of August 2011:
• 85 million deposited substance records
  o Representing more than 30 million chemically unique compounds
• 500 thousand bioassay records
  o Representing more than 130 million experimental bioactivity results

• Database of biological activities of small molecules
• Repository for data from NIH Molecular Libraries program

Page 11

Page 12

ToxMap: Environmental Health Maps

Page 13

• Almost 900 in English & Spanish
• ~40,000 links
• ~1,000 drugs; 100 supplements
• >170 tutorials; >75 anatomy videos; >125 surgery videos
• Since 2006; English & bilingual issues
• >40 languages; >250 topics; >3,300 links
• Over 100 directories of doctors, hospitals, clinics & libraries
• ~3,500 articles; >2,000 images
• 15-20 stories added daily
• >1,200 links to ClinicalTrials.gov

Page 14

MedlinePlus: Trusted Health Information
www.medlineplus.gov

MEDLINEPLUS CONNECT
Links from diagnosis, drug, and laboratory information in EHR/PHR to relevant material in MedlinePlus.

MEDLINEPLUS MOBILE
Streamlined content specifically tailored to the user's particular type of cell phone or tablet.

MEDLINEPLUS USAGE
• 150 million visitors in 2010
• 420,000 visitors per day

[Figure: Map of 100+ million visits in the United States in 2010, labeled with visit counts by state]

Page 15

Genetic test means an analysis of human DNA, RNA, chromosomes, proteins, or metabolites, if the analysis detects genotypes, mutations, or chromosomal changes. Genetic test does not include an analysis of proteins or metabolites that is directly related to a manifested disease, disorder, or pathological condition.

Page 16

Page 17

NLM is Not Alone: Growing interest in data at NIH

“[High throughput technologies] provide us with the opportunity to ask questions that have the word ‘ALL’ in them. What are ALL the transcripts in a cell? What are ALL the protein interactions? . . . Those kinds of questions are now approachable, especially if we do the right job of making really powerful databases publicly accessible to all those who need them and empower investigators in small labs as well as big labs to plunge into that kind of mindset.”

- Francis S. Collins, MD, PhD [Director, NIH]

Page 18

http://report.nih.gov/UploadDocs/Biomed_Info_Resources_FY08_09.pdf

http://report.nih.gov/biennialreport/

Page 19

http://report.nih.gov/UploadDocs/Biomed_Info_Resources_FY08_09.pdf

Page 20

Select NIH Data Initiatives

• NDAR – National Database for Autism Research (NIMH)
  – Repository for NIH-funded autism studies and centers of excellence
  – Genomic, phenotypic, imaging data and associated information
• ADNI – Alzheimer’s Disease Neuroimaging Initiative (NIA)
  – Multisite study, public-private partnership, validated biomarkers
  – Centralized fMRI and PET data, linked clinical database
• NIDDK Data Repository
  – Archival datasets from NIDDK-funded studies (diabetes, digestive, kidney)
  – 29 datasets to date; more than 100 access requests in 2009-10
• BTRIS – Biomedical Translational Research Information System (CC)
  – Repository for data from NIH intramural clinical studies
  – Allows aggregation and analysis across multiple Institute studies

Page 21

Data Sharing Policies

• NIH Public Access Policy (journal articles)
• NIH Data Sharing Policy (data sharing plan)
• NIH GWAS Policy
  – dbGaP
• Clinical Trials Info
  – ClinicalTrials.gov
• NIH Sequence Data Sharing Policy
  – GenBank
  – GEO
• IC or domain-specific policies
  – Autism Research – National Database for Autism Research
  – NIAAA Genetics of Alzheimer’s
  – Alzheimer’s Disease Neuroimaging Initiative (LONI Repository)
  – Others. . .

Page 22

Recent Guidance for NIH Data Sharing Plans

http://grants.nih.gov/grants/sharing_key_elements_data_sharing_plan.pdf

Page 23

NLM 175th Anniversary