Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Melissa Haendel, PhD @ontowonka

Upload: mhaendel

Post on 23-Jan-2018


TRANSCRIPT

Page 1: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Melissa Haendel, PhD

@ontowonka

Page 2: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Prevailing clinical genomic pipelines leverage only a tiny fraction of the available data

Page 3: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Prevailing clinical genomic pipelines leverage only a tiny fraction of the available data

Under-utilized data

Loss of discriminatory power?

Page 4: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

More species = more coverage

Number of human protein-coding genes in ExAC DB as per Lek et al. Nature 2016: 19,008

9,739 of 19,008 (51%)

Page 5: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

More species = more coverage

Number of human protein-coding genes in ExAC DB as per Lek et al. Nature 2016: 19,008

9,739 of 19,008 (51%)

14,779 of 19,008 (78%)

Mungall et al Nucleic Acids Research bit.ly/monarch-nar-2016

Page 6: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

More species = more coverage

Even inclusion of just four species boosts phenotypic coverage of genes by 38 percentage points (51% → 89%)

Combined = 89%

2,195 + 7,544 + 7,235 = 16,974 of 19,008 (union of coverage in any species)

Mungall et al Nucleic Acids Research bit.ly/monarch-nar-2016
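
The coverage figures on pages 4 to 6 can be checked with a few lines of arithmetic. The sketch below is only a sanity check of the numbers shown on the slides; the dictionary keys and the helper name `percent_covered` are illustrative labels, not terms used in the talk.

```python
# Counts copied from the slides; the dictionary keys are descriptive labels only.
TOTAL_GENES = 19_008  # human protein-coding genes (Lek et al., as cited on the slide)

COUNTS = {
    "page 4": 9_739,
    "page 5": 14_779,
    "combined": 16_974,
}


def percent_covered(count: int, total: int = TOTAL_GENES) -> int:
    """Share of genes covered, rounded to a whole percent."""
    return round(100 * count / total)


percentages = {label: percent_covered(n) for label, n in COUNTS.items()}

# The three partial counts on page 6 should add up to the combined total.
combined_from_parts = 2_195 + 7_544 + 7_235

# The quoted "38%" boost is a difference in percentage points.
boost_points = percentages["combined"] - percentages["page 4"]

print(percentages)
print(combined_from_parts, boost_points)
```

Read this way, the boost quoted on page 6 is the difference between the two coverage percentages rather than a relative increase.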

Page 7: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Different communities use different languages

Polydactyly

Triphalangeal thumb

Extra thumb bone

https://radiopaedia.org/cases/triphalangeal-thumb-in-fanconi-anemia

Pajni-Underwood, 2007, http://dev.biologists.org/content/134/12/2359

Page 8: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Challenge: Each data source uses its own vocabulary/ontology

MP

HP

MGI

HPOA

Page 9: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Challenge: Each data source uses its own vocabulary/ontology

ZFA

MP

DPO

WPO

HP

OMIA

VT

FYPO

APO

SNOMED

………

WB

PB

FB

OMIA

MGI

RGD

ZFIN

SGD

IMPC

OMIM…

QTLdb

HPOA

EHR

Page 10: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Can we help machines understand phenotypes?

“Triphalangeal thumb”

Human phenotype

I have absolutely no idea what that means

Page 11: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Decomposition of complex concepts allows interoperability

Species neutral ontologies, homologous concepts

[Diagram: “Triphalangeal thumb” (human phenotype) is decomposed into terms from Uberon, PATO and GO. Terms shown: Phalanx of manual digit; Autopod; duplicated; embryonic skeletal system morphogenesis]

Page 12: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Decomposition of complex concepts allows interoperability

Species neutral ontologies, homologous concepts

[Diagram: “Triphalangeal thumb” (human phenotype) and “Polydactyly” (mouse phenotype) are each decomposed into terms from Uberon, PATO and GO. Terms shown: Phalanx of manual digit; Autopod; duplicated; embryonic skeletal system morphogenesis]
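
To make the decomposition idea concrete, here is a minimal sketch in which each phenotype is a set of (ontology, term) components and two phenotypes are compared by their shared components. Which component belongs to which phenotype is an assumption made for illustration, since the slide layout does not survive in this transcript, and the Jaccard overlap is a simple stand-in rather than the similarity measure used by the authors.

```python
from typing import FrozenSet, Tuple

Component = Tuple[str, str]  # (ontology, term label)

# ASSUMPTION: the assignment of components to each phenotype is illustrative.
TRIPHALANGEAL_THUMB: FrozenSet[Component] = frozenset({
    ("Uberon", "phalanx of manual digit"),
    ("PATO", "duplicated"),
    ("GO", "embryonic skeletal system morphogenesis"),
})

POLYDACTYLY: FrozenSet[Component] = frozenset({
    ("Uberon", "autopod"),
    ("PATO", "duplicated"),
    ("GO", "embryonic skeletal system morphogenesis"),
})


def shared_components(a: FrozenSet[Component], b: FrozenSet[Component]) -> FrozenSet[Component]:
    """Components that two decomposed phenotypes have in common."""
    return a & b


def jaccard(a: FrozenSet[Component], b: FrozenSet[Component]) -> float:
    """Overlap score in [0, 1]; a stand-in for a real semantic-similarity measure."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


overlap = shared_components(TRIPHALANGEAL_THUMB, POLYDACTYLY)
score = jaccard(TRIPHALANGEAL_THUMB, POLYDACTYLY)
print(sorted(overlap), score)
```

A real system would also use the ontology hierarchy, for example a part-of or subclass relation between the two Uberon terms, so that related but non-identical anatomical terms contribute to the match; this sketch ignores that.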

Page 13: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Example case solved by Exomiser

[Figure labels: Phenotypic profile; Genes; Heterozygous, missense mutation; STIM-1; N/A; Stim1Sax/Sax]

Ranked STIM-1 variant maximally pathogenic based on cross-species genotype-to-phenotype (G2P) data, in the absence of traditional data sources

https://exomiser.github.io/Exomiser/

bit.ly/stim1paper

Page 14: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Example case solved by Exomiser

[Figure labels: Phenotypic profile; Genes; Heterozygous, missense mutation; STIM-1; N/A; Stim1Sax/Sax]

Ranked STIM-1 variant maximally pathogenic based on cross-species G2P data, in the absence of traditional data sources

bit.ly/stim1paper

In Genomics England 100K Genomes, for 82% of the first 1,936 diagnosed patients the diagnosis is in the top 5 Exomiser hits, across a range of rare diseases and family structures

Page 15: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery
Page 16: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Harmonizing diseases, phenotypes, anatomy, and genotypes

91% of our 2.2 million G2P associations require integrating two or more data sources

Page 17: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Enabling phenotype comparisons across diseases and species

Page 18: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Plain language synonyms for computable phenotypes

Page 19: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Translational applicability for Fanconi anemia (FA)

Tools can support more rapid diagnostics for FA patients

Integration of data enables mechanistic discovery and new candidate gene targets

Identification of models for FA hypothesis validation

Helping patients contribute data and participate in their ongoing evaluation, care, and science

Page 20: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Acknowledgements

Lawrence Berkeley: Chris Mungall, Suzanna Lewis, Jeremy Nguyen, Seth Carbon, Nicole Washington

Charité: Sebastian Köhler

Garvan: Tudor Groza, Craig McNamara

RTI: Jim Balhoff

Boston Children’s: Ingrid Holm, Catherine Brownstein, John Brownstein

ClinGen: Heidi Rehm, Larry Babb, Harindra Arachchi

OHSU: Matt Brush, Kent Shefchek, Julie McMurry, Tom Conlin, Nicole Vasilevsky, Dan Keith, Maureen Hoatlin

Genomics England/Queen Mary: Damian Smedley, Jules Jacobsen, Tomasz Konopka, Pilar Cacheiro

Jackson Laboratory: Peter Robinson, Leigh Carmody, Hannah Blau

EBI: Helen Parkinson, David Osumi-Sutherland

Johns Hopkins: Chris Chute, Casey Overby, Ada Hamosh

Mayo: Hongfang Liu, Ravi Komandur

UCSC: David Haussler, Benedict Paten, Mark Diekhans

Scripps: Andrew Su, Ben Good, Chunlei Wu, Gregg Stupp

Sanford Health Imagenetics: Neal Boerkoel, Kayli Rageth, Murat Sincan

With special thanks to Julie McMurry for excellent graphic design

Page 21: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

www.monarchinitiative.org

Chris Mungall, Peter Robinson, Damian Smedley

Funding: NIH Office of Director: 1R24OD011883; NIH-UDP: HHSN268201300036C, HHSN268201400093P; NCI/Leidos #15X143; BD2K U54HG007990-S2 (Haussler) & BD2K PA-15-144-U01 (Kesselman)

Page 22: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

extra

Page 23: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Layperson-HPO driven phenotyping tool

https://www.pcori.org/research-results/2017/realization-standard-care-rare-diseases-using-patient-engaged-phenotyping

Page 24: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Genes, Environment, Phenotypes

Standard exchange formats exist for genes … but for phenotypes? Environment?

Formats shown: VCF, GFF, BED (genes); PXF (phenotypes)

Page 25: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

What does a phenopacket look like?

Alacrima, Sleep Apnea, Microcephaly

phenotype_profile:
  - entity: "patient16"
    phenotype:
      types:
        - id: "HP:0000522"
          label: "Alacrima"
      onset:
        description: "at birth"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "PMID:"

Clinical labs, Public databases, Journals

Page 26: Data Translator: an Open Science Data Platform for Mechanistic Disease Discovery

Layperson HPO + Phenopackets

Dry eyes, Stops breathing during sleep, Small head

phenotype_profile:
  - entity: "Grace"
    phenotype:
      types:
        - id: "HP:0000522"
          label: "Alacrima"
      onset:
        description: "at birth"
        types:
          - id: "HP:0003577"
            label: "Congenital onset"
    evidence:
      - types:
          - id: "ECO:0000033"
            label: "Traceable Author Statement"
        source:
          - id: "https://twitter.com/examplepatient/status/123456789"

• Patient registries
• Social media
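
As a rough illustration of how a plain-language term could end up in a phenopacket, the sketch below maps a layperson phrase to its clinical label and builds one `phenotype_profile` entry with the same shape as the example above. The helper `build_profile_entry` and both lookup tables are hypothetical; the lay-to-clinical pairing is inferred from the order of the labels on pages 25 and 26, and only the HPO identifier that appears on the slides is included.

```python
import json

# ASSUMPTION: pairing inferred from the label order on the two slides.
LAY_TO_CLINICAL = {
    "dry eyes": "Alacrima",
    "stops breathing during sleep": "Sleep Apnea",
    "small head": "Microcephaly",
}

# Only this identifier appears on the slides; other IDs are deliberately left out.
CLINICAL_TO_HPO = {"Alacrima": "HP:0000522"}


def build_profile_entry(entity: str, lay_term: str, source_id: str) -> dict:
    """Hypothetical helper: build one phenotype_profile entry from a lay term."""
    label = LAY_TO_CLINICAL[lay_term.strip().lower()]
    hpo_id = CLINICAL_TO_HPO.get(label)
    if hpo_id is None:
        raise KeyError(f"no HPO id recorded in this sketch for {label!r}")
    return {
        "entity": entity,
        "phenotype": {
            "types": [{"id": hpo_id, "label": label}],
            # Onset and evidence values are copied from the slide example.
            "onset": {
                "description": "at birth",
                "types": [{"id": "HP:0003577", "label": "Congenital onset"}],
            },
        },
        "evidence": [{
            "types": [{"id": "ECO:0000033", "label": "Traceable Author Statement"}],
            "source": [{"id": source_id}],
        }],
    }


packet = {
    "phenotype_profile": [
        build_profile_entry(
            "Grace",
            "Dry eyes",
            "https://twitter.com/examplepatient/status/123456789",
        )
    ]
}
print(json.dumps(packet, indent=2))
```

The sketch emits JSON because it needs only the standard library; the same dictionary could be written out as YAML to match the slides.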