Opportunities and Challenges for International Cooperation Around Big Data

Opportunities and Challenges for International Collaboration Around Big Data Philip E. Bourne, PhD Associate Director for Data Science National Institutes of Health [email protected] November 12, 2014

Upload: philip-bourne

Post on 25-Jun-2015


DESCRIPTION

Presented at the Open Session of the Big Data in Biomedicine conference, Barcelona, November 12, 2014.

TRANSCRIPT

Page 1: Opportunities and Challenges for International Cooperation Around Big Data

Opportunities and Challenges for International Collaboration Around Big Data

Philip E. Bourne, PhD

Associate Director for Data Science

National Institutes of Health

[email protected]

November 12, 2014

Page 2

A Bottom Up Exemplar

Page 3

Top Down

Protein sequence and functional annotation

Gene Ontology annotation

Pathway and reaction annotation

Protein interaction annotation

Evidence-based proteomics annotation

Cellular models

Variants Annotation: ClinVar / OMIM, MedThesaurus

[adapted from Ioannis Xenarios]

Page 4

What Else Can We Do from the Top Down?

Page 5

The NIH Data Science Mission Statement

To foster an ecosystem that enables biomedical* research to be conducted as a digital enterprise that enhances health, lengthens life and reduces illness and disability

* Includes biological, biomedical, behavioral, social, environmental, and clinical studies that relate to understanding health and disease.

Page 6
Page 7

Elements of The Ecosystem

Community

Policy

Infrastructure

• Sustainability

• Collaboration

• Training

Page 8

Elements of The Ecosystem

Community

Policy

Infrastructure

• Sustainability

• Collaboration

• Training

Virtuous Research Cycle

Page 9

The Virtuous Cycle

September 3, 2014 Workshop: http://goo.gl/fkWjhS

Page 10

Policies – Now & Forthcoming

Data Sharing

– Genomic data sharing announced

– Data sharing plans on all research awards

– Data sharing plan enforcement

• Machine readable plan

• Repository requirements to include grant numbers

http://www.nih.gov/news/health/aug2014/od-27.htm
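As a rough illustration of how a machine readable plan and grant numbers on repository records could support enforcement, the sketch below compares what an award promised to share with what has been deposited. The slides do not define any format, so the `SharingPlan` and `Deposit` classes, their field names, and the placeholder grant number are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class SharingPlan:
    """Hypothetical machine readable data sharing plan for one award."""
    grant_number: str
    planned_datasets: list  # names of datasets the award promised to share


@dataclass
class Deposit:
    """Hypothetical repository record for one deposited dataset."""
    dataset: str
    grant_numbers: list = field(default_factory=list)


def unmet_commitments(plan, deposits):
    """Return planned datasets with no deposit that cites the plan's grant number."""
    shared = {d.dataset for d in deposits if plan.grant_number in d.grant_numbers}
    return [name for name in plan.planned_datasets if name not in shared]


# Illustrative data only: "GRANT-0001" is a placeholder, not a real award.
plan = SharingPlan("GRANT-0001", ["rna-seq", "imaging"])
deposits = [Deposit("rna-seq", ["GRANT-0001"]), Deposit("imaging")]
print(unmet_commitments(plan, deposits))
```

In this toy example the imaging deposit exists but carries no grant number, so it cannot be matched to the plan. That is the kind of gap a grant number requirement on repositories would close.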

Page 11

Policies - Forthcoming

Data Citation

– Goal: legitimize data as a form of scholarship

– Process:

• Machine readable standard for data citation (done)

• Endorsement of data citation for inclusion in NIH biosketch, grants, reports, etc.

• Example formats for human readable data citations

• Slowly work into NLM/NCBI workflow
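The relationship between a machine readable citation and a human readable one can be sketched as follows. The slides do not name the standard or its fields, so the record layout, the field names, and the output format here are assumptions chosen only to show the idea of deriving one form from the other.

```python
import json

# Hypothetical machine readable citation record; all field names and values
# are illustrative assumptions, not taken from any specific standard.
record_json = """
{
  "creators": ["Doe, J.", "Roe, R."],
  "year": 2014,
  "title": "Example expression dataset",
  "repository": "Example Repository",
  "identifier": "example:12345"
}
"""


def format_citation(record):
    """Render a machine readable record as one human readable citation line."""
    creators = "; ".join(record["creators"])
    return (
        f"{creators} ({record['year']}). {record['title']}. "
        f"{record['repository']}. {record['identifier']}"
    )


record = json.loads(record_json)
print(format_citation(record))
```

Keeping the structured record as the source of truth means several human readable formats can be generated from it without re-entering the metadata.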

Page 12

Infrastructure - The Commons

[Diagram labels: BD2K Center (×6), DDICC, Software, Standards, Labs (×4)]

Page 13

What is the Commons?

A conceptual framework for sharing, finding, integrating, reusing and attributing digital research objects

– “Each digital object has a UID that must allow it to be found, shared and attributed” – The Commons Document

The Commons is agnostic of computing platform
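A minimal sketch of the UID idea, assuming a simple in-memory registry: every object gets a UID on registration, and that UID is enough to find the object, share it by resolving it, and produce an attribution line. The class names, fields, example location, and credit format are hypothetical; the Commons document itself is not reproduced in these slides.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class DigitalObject:
    """Hypothetical record for a digital research object (dataset, software, ...)."""
    title: str
    creators: list
    kind: str      # e.g. "dataset" or "software"; labels are illustrative only
    location: str  # any URI; the registry does not care which platform hosts it
    uid: str = field(default_factory=lambda: uuid.uuid4().hex)


class Registry:
    """Toy in-memory registry that resolves UIDs and searches indexed titles."""

    def __init__(self):
        self._objects = {}
        self._index = {}  # lowercase title word -> set of UIDs

    def register(self, obj):
        self._objects[obj.uid] = obj
        for word in obj.title.lower().split():
            self._index.setdefault(word, set()).add(obj.uid)
        return obj.uid

    def resolve(self, uid):
        """Share: a UID alone is enough to retrieve the object."""
        return self._objects[uid]

    def search(self, word):
        """Find: look up objects by a word in their indexed title."""
        return [self._objects[u] for u in sorted(self._index.get(word.lower(), ()))]

    def attribution(self, uid):
        """Attribute: build a simple credit line from the object's metadata."""
        obj = self._objects[uid]
        return f"{', '.join(obj.creators)}. {obj.title}. UID: {obj.uid}"


registry = Registry()
uid = registry.register(
    DigitalObject("Example variant calls", ["A. Researcher"], "dataset",
                  "file:///data/calls.vcf")
)
print(registry.attribution(uid))
```

Because the registry stores only a location string, the same lookup works whether the object sits on a public cloud, an HPC center, or an in-house system.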

Page 14

The Commons: Framework Implementation

Digital Objects (with UIDs)

Search (indexed metadata)

Computing Platform

The Commons

Page 15

The Commons: Framework Implementation

Digital Objects (with UIDs)

Search (indexed metadata)

Computing Platform

The Commons

Page 16

The Commons: Framework Draft Implementations

The Commons Conceptual Framework

Public Cloud Platforms

– Google, AWS (Amazon), Microsoft (Azure), IBM, other?

– Most easily accessed by NIH PIs

Super Computing (HPC) Platforms

– Super Computing 2014: ADDS coordinating meeting with SC centers

– NERSC “Commons Pilot”

Other Platforms?

– In-house compute solutions

– Private clouds, HPC: Pharma, The Broad, Bionimbus

– Low access by NIH PIs

Page 17

The Commons: Framework Implementation

Digital Objects (with UIDs)

Search (indexed metadata)

Computing Platform

The Commons

Page 18

The Commons: Framework Draft Implementation

Digital Objects to populate and test the Commons:

– BD2K centers, NCI Cloud pilots (Google & AWS supported)

– Large Public Data Sets, MODs

Search

– BD2K Data and Software Discovery Indices

– Google Search functions

Use cases

The Commons Conceptual Framework

Public Cloud Platforms

Page 19

The Commons: Framework Draft Implementation

Next Steps

– Determine which BD2K centers are most appropriate for a cloud Commons pilot

– Develop a plan of action with NCI Cloud pilots

– Working with DDIC/SW Discovery Indices (UIDs, Search)

– Working with Google and AWS (Amazon) to determine what is needed computationally

• In kind support (short term pilot)

• Conformant clouds (long term sustainable model)

– Developing use cases!

The Commons Conceptual Framework

Public Cloud Platforms

Page 20

A Business Model for The Commons

The Commons: Framework Draft Implementation

Page 21

Community – BD2K Awards

Page 22

Community: BD2K Awards Governance

November 3 Kick-off PI Meeting

– Emphasis on working groups that span centers and begin the work of building the ecosystem

• Common API development (with GA4GH)

• Mobile

• Metadata

• Grand challenges

– Emphasize sharing from day 1

– Incentivized to work in the Commons

Page 23

Community Short Term Interactions

NSF Workshops and Dear Colleague letter

Workshop with NOAA on public – private partnerships

ELIXIR Workshop

– Standards

– Training

Workshop Inspiring the Game Developer Community to Engage in and Enhance Biomedical Research, Dec 2014

Sustainability of Data Resources 2015

Page 24

Community: Training

Data Science Training Goals

1) Build a digital framework for data science training: NIH Data Science Workforce Development Center

2) Develop short-term training opportunities: courses, educational resources, etc.

3) Develop the discipline of biomedical data science and support cross-training

Goals expanded from recommendations in the June 2012 DIWG and Aug 2013 Training workshop reports.

Page 25

Heads Up on What is Coming in FY15

Calls for using the Commons

Calls for standards framework development

Calls for software development

Calls to stimulate interactions between communities (diversity, rotations, library)

Calls for high risk, high return projects

Your ideas here…..

Page 26

NIH… Turning Discovery Into Health

[email protected]